Abstract

We consider the problem of testing a particular type of composite null hypothesis under a
nonparametric multivariate regression model. For a given quadratic functional $Q$, the null
hypothesis states that the regression function $f$ satisfies the constraint $Q[f] = 0$, while the
alternative corresponds to the functions for which $Q[f]$ is bounded away from zero. On the
one hand, we provide minimax rates of testing and the exact separation constants, along
with a sharp-optimal testing procedure, for diagonal and nonnegative quadratic functionals.
We consider smoothness classes of ellipsoidal form and check that our conditions are
fulfilled in the particular case of ellipsoids corresponding to anisotropic Sobolev classes.
In this case, we present a closed form of the minimax rate and the separation constant.
On the other hand, minimax rates for quadratic functionals that are neither positive nor
negative exhibit two different regimes: "regular" and "irregular". In the "regular"
case, the minimax rate is equal to $n^{-1/4}$, while in the "irregular" case the rate depends
on the smoothness class and is slower than in the "regular" case. We apply this to the
issue of testing the equality of norms of two functions observed in noisy environments.

AMS 2000 subject classifications: Primary 62G08, 62G10; secondary 62G20. 
Keywords and phrases: Nonparametric hypotheses testing, sharp asymptotics, separation
rates, minimax approach, high-dimensional regression.

1. Introduction 

1.1. Problem statement 

Consider the nonparametric regression model with multi-dimensional random design: we
observe $(x_i, t_i)_{i=1,\dots,n}$ obeying the relation

$$x_i = f(t_i) + \xi_i, \qquad i = 1, \dots, n, \tag{1}$$

where $t_i \in \Delta \subset \mathbb R^d$ are random design points, $1 \le d < \infty$, $f : \Delta \to \mathbb R$ is the unknown
regression function and the $\xi_i$'s represent observation noise. Throughout this work, we assume
that the vectors $t_i = (t_i^1, \dots, t_i^d)$, for $i = 1, \dots, n$, are independent and identically distributed
with uniform distribution on $\Delta = [0,1]^d$, which is equivalent to the coordinates $t_i^j$ being i.i.d. $U(0,1)$. Furthermore,
conditionally on $T_n = \{t_1, \dots, t_n\}$, the variables $\xi_1, \dots, \xi_n$ are assumed i.i.d. with zero mean
and variance $\tau^2$, for some known $\tau \in (0, \infty)$.

Let $L_2(\Delta)$ denote the Hilbert space of all square-integrable functions defined on $\Delta$. Assume
that we are given two disjoint subsets $\mathcal F_0$ and $\mathcal F_1$ of $L_2(\Delta)$. We are interested in analyzing
the problem of testing hypotheses:

$$H_0 : f \in \mathcal F_0 \quad \text{against} \quad H_1 : f \in \mathcal F_1. \tag{2}$$

To be more precise, let us set $z_i = (x_i, t_i)$ and denote by $P_f$ the probability distribution
of the data vector $(z_1, \dots, z_n)$ given by (1). The expectation with respect to $P_f$ is denoted
by $E_f$. The goal is to design a testing procedure $\phi_n : (\mathbb R \times \Delta)^n \to \{0, 1\}$ for which we are
able to establish theoretical guarantees in terms of the cumulative error rate (the sum of the
probabilities of type I and type II errors):

$$\gamma_n(\mathcal F_0, \mathcal F_1, \phi_n) = \sup_{f \in \mathcal F_0} P_f(\phi_n = 1) + \sup_{f \in \mathcal F_1} P_f(\phi_n = 0). \tag{3}$$

To measure the statistical complexity of this testing problem, it is relevant to analyze the 
minimax error rate 

$$\gamma_n(\mathcal F_0, \mathcal F_1) = \inf_{\phi_n} \gamma_n(\mathcal F_0, \mathcal F_1, \phi_n), \tag{4}$$

where $\inf_{\phi_n}$ denotes the infimum over all testing procedures.

The focus in this paper is on a particular type of null hypotheses $H_0$ that can be defined as
the set of functions lying in the kernel of some quadratic functional $Q : L_2(\Delta) \to \mathbb R$, i.e.,
$\mathcal F_0 \subset \{f \in L_2(\Delta) : Q[f] = 0\}$. As described later in this section, this kind of null hypotheses
naturally arises in several problems including variable selection, testing partial linearity of a
regression function or the equality of norms of two signals. Then, it is appealing to define the
alternative as the set of functions satisfying $|Q[f]| \ge \rho^2$ for some $\rho > 0$. However, without
further assumptions on the nature of the functions $f$, it is impossible to design consistent testing
procedures for discriminating between $\mathcal F_0$ and $\mathcal F_1$. One approach to making the problem
meaningful is to assume that the function / belongs to a smoothness class. Typical examples 
of smoothness classes are Sobolev and Holder classes, Besov bodies or balls in reproducing 
kernel Hilbert spaces. 

In the present work, we assume that the function $f$ belongs to a smoothness class $\Sigma$ that
can be seen as an ellipsoid in the infinite-dimensional space $L_2(\Delta)$. Thus, the null and the
alternative are defined by

$$\mathcal F_0 = \{f \in \Sigma : Q[f] = 0\}, \qquad \mathcal F_1 = \mathcal F_1(\rho) = \{f \in \Sigma : |Q[f]| \ge \rho^2\}. \tag{5}$$

Note that both hypotheses are composite and nonparametric.

1.2. Background on minimax rate- and sharp- optimality 

Given the observations $(x_i, t_i)_{i=1,\dots,n}$, we consider the problem of testing the composite
hypothesis $\mathcal F_0$ against the nonparametric alternative $\mathcal F_1(\rho)$ defined by (5). The goal here is to
obtain, if possible, both rate and sharp asymptotics for the cumulative error rate in the
minimax setup. These notions are defined as follows. For a fixed small number $\gamma \in (0,1)$, the
sequence $r_n^*$ is called minimax rate of testing if:

• there exists $C' > 0$ such that for all $C < C'$, we have $\liminf_{n\to\infty} \gamma_n(\mathcal F_0, \mathcal F_1(C r_n^*)) \ge \gamma$,

• there exists $C'' > 0$ and a test $\phi_n$ such that for all $C > C''$, we have $\limsup_{n\to\infty} \gamma_n(\mathcal F_0, \mathcal F_1(C r_n^*), \phi_n) \le \gamma$.

A testing procedure $\phi_n$ is called minimax rate-optimal if $\limsup_{n\to\infty} \gamma_n(\mathcal F_0, \mathcal F_1(C r_n^*), \phi_n) \le \gamma$
for some $C > 0$. Note that the minimax rate and the rate-optimal test may depend on the
prescribed significance level $\gamma$. However, in most situations this dependence cancels out from
the rate and appears only in the constants. If the constants $C'$ and $C''$ coincide, then their
common value is called exact separation constant and any test satisfying the second condition
is called minimax sharp-optimal. The minimax rate $r_n^*$ is actually not uniquely defined, but
the product of the minimax rate with the exact separation constant is uniquely defined up
to an asymptotic equivalence. For more details on minimax hypotheses testing we refer to
(Ingster and Suslina, 2003).

While minimax rate-optimality is a desirable feature for a testing procedure, it may still lead 
to overly conservative tests. A (partial) remedy for this issue is to consider sharp asymptotics 
of the error rate. In fact, one can often prove that when $n \to \infty$,

$$\gamma_n(\mathcal F_0, \mathcal F_1(\rho)) = 2\Phi(-u_n(\rho)) + o(1), \tag{6}$$

where $\Phi$ is the c.d.f. of the standard Gaussian distribution, $u_n(\cdot)$ is some "simple" function
from $\mathbb R_+$ to $\mathbb R$ and $o(1)$ is a term tending to zero uniformly in $\rho$ as $n \to \infty$. This relation
implies that by determining $r_n^*$ as a solution with respect to $\rho$ to the equation $u_n(\rho) = z_{1-\gamma/2}$
(where $z_\alpha$ stands for the $\alpha$-quantile of the standard Gaussian distribution) we get not only
the minimax rate, but also the exact separation constant. When relation (6) is satisfied, we
say that Gaussian asymptotics hold.

1.3. Overview of the main contributions 

Our contributions focus on the case where the smoothness class $\Sigma$ is an ellipsoid in $L_2(\Delta)$ and
the quadratic functional $Q$ admits a diagonal form in the orthonormal basis corresponding to
the directions of the axes of the ellipsoid $\Sigma$. To be more precise, let $\mathcal L$ be a countable set and
$\{\varphi_l\}_{l\in\mathcal L}$ be an orthonormal system in $L_2(\Delta)$. For a function $f \in L_2(\Delta)$, let $\theta[f] = \{\theta_l[f]\}_{l\in\mathcal L}$
be the generalized Fourier coefficients with respect to this system, i.e., $\theta_l[f] = \langle f, \varphi_l\rangle$, where
$\langle\cdot,\cdot\rangle$ denotes the inner product in $L_2(\Delta)$. The functional sets $\Sigma \subset L_2(\Delta)$ under consideration
are subsets of ellipsoids with directions of axes $\{\varphi_l\}_{l\in\mathcal L}$ and with coefficients $c = \{c_l\}_{l\in\mathcal L} \in \mathbb R_+^{\mathcal L}$:

$$\Sigma = \Big\{f = \sum_{l\in\mathcal L} \theta_l[f]\varphi_l \ :\ \sum_{l\in\mathcal L} c_l\,\theta_l[f]^2 \le 1\Big\}. \tag{7}$$

The diagonal quadratic functional is defined by a set of coefficients $q = \{q_l\}_{l\in\mathcal L}$: $Q[f] =
\sum_{l\in\mathcal L} q_l\,\theta_l[f]^2$. Note that if $Q$ is positive definite, i.e., $q_l > 0$ for all $l \in \mathcal L$, then the null
hypothesis becomes $f = 0$ and the problem under consideration is known as the detection problem.
However, the goal of the present work is to consider more general types of diagonal quadratic
functionals. Namely, two situations are examined: (a) all the coefficients $q_l$ are nonnegative
and (b) the two sets $\mathcal L_+ = \{l \in \mathcal L : q_l > 0\}$ and $\mathcal L_- = \{l \in \mathcal L : q_l < 0\}$ are nonempty.
In the first situation, we establish Gaussian asymptotics of the cumulative error rate and 
propose a minimax sharp-optimal test. Under some conditions, we show that the sequence

$$r_n^* = \min\Big\{\rho > 0 \ :\ \inf_{v \in \mathbb R_+^{\mathcal L}:\ \langle v, c\rangle \le 1;\ \langle v, q\rangle \ge \rho^2} \|v\|_2^2 \ \ge\ 8 n^{-2} z_{1-\gamma/2}^2\Big\} \tag{8}$$

provides minimax rate of testing with constants $C' = C'' = 1$; here $\|\cdot\|_2$ and $\langle\cdot,\cdot\rangle$ denote
the usual norm and the inner product in $\ell_2(\mathcal L)$, the space of square-summable arrays indexed
by $\mathcal L$. This result is instantiated to
some examples motivating our interest in testing the hypotheses (5). One example, closely
related to the problem of variable selection (Comminges and Dalalyan, 2011), is testing the
relevance of a particular covariate in high-dimensional regression. This problem is considered in a
more general setup corresponding to testing that a partial derivative of order $\alpha = (\alpha_1, \dots, \alpha_d)$,
denoted by $\partial^{\alpha_1 + \dots + \alpha_d} f / \partial t_1^{\alpha_1} \cdots \partial t_d^{\alpha_d}$, is identically equal to zero against the hypothesis that
this derivative is significantly different from 0. As a consequence of our main result, we
show that if $f$ lies in the anisotropic Sobolev ball of smoothness $\sigma = (\sigma_1, \dots, \sigma_d)$, and we set
$\delta = \sum_{i=1}^d \alpha_i/\sigma_i$, $\bar\sigma = \big(\frac{1}{d}\sum_{i=1}^d \sigma_i^{-1}\big)^{-1}$, then the minimax optimal rate is $r_n^* = n^{-2\bar\sigma(1-\delta)/(4\bar\sigma+d)}$,
provided that $\delta < 1$ and $\bar\sigma > d/4$. Furthermore, we derive Gaussian asymptotics and exhibit
the exact separation constant in this problem.

The second situation we examine in this paper concerns the case where the cardinalities
of both $\mathcal L_+$ and $\mathcal L_-$ are nonzero. A typical application of this kind of problem is testing the
equality of the norms of two signals observed in noisy environments. In this set-up, we provide
minimax rates of testing and exhibit the presence of two regimes that we call regular regime
and irregular regime. In the regular regime, the minimax rate is $r_n^* = n^{-1/4}$, while in the
irregular case it may be of the form $n^{-a}$ with an $a < 1/4$ that depends on the degree of
smoothness of the functional class.

Note that all our results are non-adaptive: our testing procedures make explicit use of the 
smoothness characteristics of the function /. Adaptation to the unknown smoothness for the 
problem we consider is an open question for which the works (Spokoiny, 1996, Gayraud and Pouet, 
2005) may be of valuable guidance. 

1.4. Relation to previous work

Starting from the seminal papers by Ermakov (1990) and Ingster (1993a,b,c), minimax testing
of nonparametric hypotheses has received a great deal of attention. A detailed review of the
literature on this topic being out of scope of this section, we only focus on discussing those 
previous results which are closely related to the present work. The goal here is to highlight 
the common points and the most striking differences with the existing literature. 
Note that the major part of the statistical inference for nonparametric hypotheses testing 
was developed for the Gaussian white noise model (GWNM) and its equivalent formulation 
as Gaussian sequence model (GSM). As recent references for the problem of testing a simple 
hypothesis in these models, we cite (Ermakov, 2011, Ingster et al., 2012), where the reader 
may find further pointers to previous work. In the present work, the null hypothesis defined by 
(5) is composite and nonparametric. Early references for minimax results for composite null 
hypotheses include (Horowitz and Spokoiny, 2001, Pouet, 2001, Gayraud and Pouet, 2001, 
2005), where the case of parametric null hypothesis is of main interest. These papers deal with 
the one-dimensional situation and provide only minimax rates of testing without attaining 
the exact separation constant. Furthermore, the alternative is defined as the set of functions 
that are at least at a Euclidean distance p from the null hypothesis, which is very different 
from the alternatives considered in this work. 

More recently, the nonasymptotic approach to minimax testing has gained popularity (Baraud et al.,
2003, 2005, Laurent et al., 2011, 2012). One of the advantages of the nonasymptotic approach
is that it removes the frontier between the concepts of parametric and nonparametric hypotheses,
while its limitation is that there is no result on sharp optimality (even the notion itself
is not well defined). Note also that all these papers deal with the GSM, considering as main
application the case of one-dimensional signals, as opposed to our set-up of regression with
high-dimensional covariates.

Let us review in more detail the papers (Ingster and Sapatinas, 2009) and (Laurent et al.,
2011) that are very closely related to our work either by the methodology which is used or by
the problem of interest. Ingster and Sapatinas (2009) extended some results on
goodness-of-fit testing for the $d$-dimensional GWNM to goodness-of-fit testing for the multivariate
nonparametric regression model. More precisely, they tested the null hypothesis $H_0 : f = f_0$,
where $f_0$ is a known function, against the alternative $H_1 : f \in \Sigma,\ \int_\Delta (f - f_0)^2 \ge r_n^2$, where
$\Sigma$ is an ellipsoid in the Hilbert space $L_2(\Delta)$ with respect to the tensor product Fourier basis
(with extensions to other bases). They obtained both rate and sharp asymptotics for the error
probabilities in the minimax setup. So the model they considered is the same as the one we are
interested in here, but the hypotheses $H_0$ and $H_1$ are substantially different. As a consequence,
the testing procedure we propose takes into account the general forms of $H_0$ and $H_1$ given
by (5) and is different from the asymptotically minimax test of Ingster and Sapatinas (2009).
Furthermore, we substantially relaxed the constraint on the noise distribution by replacing
the Gaussianity assumption by the condition of a bounded 4th moment.

Laurent et al. (2011) considered the GWNM from the inverse problem point of view, i.e.,
when the signal of interest $g$ undergoes a linear transformation $T$ before being observed in
a noisy environment. This corresponds to $f = T[g]$ with a compact injective operator $T$. Then
the two assertions $g = 0$ and $T[g] = 0$ are equivalent. Consequently, if the goal is to detect
the signal $f$, one can consider the two testing problems:

1. (inverse formulation) $H_0 : T^{-1}[f] = 0$ against $H_1 : \|T^{-1}[f]\|_2 \ge \rho$.

2. (direct formulation) $H_0 : f = 0$ against $H_1 : \|f\|_2 \ge \rho$.

The authors discussed advantages and limitations of each of these two formulations in terms of 
minimax rates. Depending on the complexity of the inverse problem and on the assumptions on 
the function to be detected (sparsity or smoothness), they proved that the specific treatment 
devoted to inverse problem which includes an underlying inversion of the operator, may worsen 
the detection accuracy. For each situation, they also highlighted the cases where the direct 
strategy fails while a specific test for inverse formulation works well. The inverse formulation 
is closely related to our definition (5) of the hypotheses $H_0$ and $H_1$, since $Q[f] = \|T^{-1}[f]\|_2^2$ is
a quadratic functional. However, our setting is more general in that we consider functionals 
with non-trivial kernels and with possibly negative diagonal entries. 

1.5. Organization

The rest of the paper is organized as follows. The results concerning sharp asymptotics for 
positive semi-definite diagonal functionals are provided in Section 2. In particular, the rates 
of separation for a general class of tests called linear U-tests are explored in Subsection 2.2. 
The asymptotically optimal linear U-test is provided in Subsection 2.3 along with its rate of 
separation, which is shown to coincide with the minimax exact rate in Subsection 2.4. Section 3 
is devoted to a discussion of the assumptions and to the consequences of the main result 
for some relevant examples. The results for diagonal quadratic functionals that are neither
nonnegative nor nonpositive are stated in Section 4 along with an application to testing the equality of the
norms of two signals. Finally, the proofs of the results are postponed to the Appendix. 
2. Minimax testing for nonnegative quadratic functionals

2.1. Additional notation 

In what follows, the notation $A_n = O(B_n)$ means that there exists a constant $c > 0$ such
that $A_n \le cB_n$ and the notation $A_n = o(B_n)$ means that the ratio $A_n/B_n$ tends to zero. The
relation $A_n \sim B_n$ means that $A_n/B_n$ tends to 1, while the relation $A_n \asymp B_n$ means that there
exist constants $0 < c_1 < c_2 < \infty$ and $n_0$ large enough such that $c_1 \le A_n/B_n \le c_2$ for $n \ge n_0$.
For a real number $c$, we denote by $c_+$ its positive part $\max(0, c)$ and by $\lfloor c\rfloor$ its integer part.
For a set $A$, $\mathbf 1_A$ stands for its indicator function and $|A|$ denotes its cardinality. Given a $q > 0$
and a function $f$, $\|f\|_q = \big(\int_\Delta |f(t)|^q\,dt\big)^{1/q}$ is the conventional $L_q$-norm of $f$. Similarly, for a
vector or an array $u$ indexed by a countable set $\mathcal L$, $\|u\|_q = \big(\sum_{l\in\mathcal L} |u_l|^q\big)^{1/q}$ is the $\ell_q$-norm of
$u$. As usual, we also denote by $\|u\|_0$ and $\|u\|_\infty$, respectively, the number of nonzero entries
and the magnitude of the largest entry of $u \in \mathbb R^{\mathcal L}$.

In the sequel, without loss of generality, we assume that the standard deviation of the noise
is equal to one: $\tau = 1$. The case of general but known $\tau$ will be formulated as a consequence.
Recall that we consider quadratic functionals $Q$ of the form $Q[f] = \sum_{l\in\mathcal L} q_l\,\theta_l[f]^2$, for some
given array $q = \{q_l\}_{l\in\mathcal L}$. The major difference between the functional $\sum_{l\in\mathcal L} \theta_l[f]^2$ that appears
in the problem of detection (Ingster and Sapatinas, 2009, Ingster et al., 2012) and this general
functional actually lies in the fact that the support of $q$ defined by $S_F = \mathrm{supp}(q) = \{l \in
\mathcal L : q_l \ne 0\}$ is generally different from $\mathcal L$. Furthermore, large coefficients $q_l$ amplify the error
of estimating $Q[f]$ and, therefore, it becomes more difficult to distinguish $H_0$ from $H_1$. An
interesting question, which we answer in the next sections, is what interplay between
$c$ and $q$ makes it possible to distinguish between the null and the alternative.
Let $S_F^c$ denote the complement of $S_F$ and, for a set $L \subset \mathcal L$, let $\mathrm{span}(\{\varphi_l\}_{l\in L})$ be the closed linear
subspace of $L_2(\Delta)$ spanned by the set $\{\varphi_l\}_{l\in L}$. Let $\Pi_{S_F} f$ and $\Pi_{S_F^c} f$ be the orthogonal
projections of a function $f \in \Sigma$ on $\mathrm{span}(\{\varphi_l\}_{l\in S_F})$ and $\mathrm{span}(\{\varphi_l\}_{l\in S_F^c})$ respectively. To simplify
notation, the subscript $S_F^c$ is omitted in the rest of the paper, i.e., $\Pi_{S_F^c} f$ is replaced by $\Pi f$.
Finally, throughout this work we will assume that $f$ is centered, i.e., $\int_\Delta f(t)\,dt = 0$, and that
$\{\varphi_l\}$ is an orthonormal basis of the subspace of $L_2(\Delta)$ consisting of all centered functions. In
other terms, all the functions $\varphi_l$ are orthogonal to the constant function.

2.2. Linear U-tests and their error rate 

We start by introducing a family of testing procedures that we call linear U-tests. To this end,
we split the sample into two parts: a small part of the sample is used to build a pilot estimator
$\widehat{\Pi f}_n$ of $\Pi f$, whereas the remaining observations are used for distinguishing between $H_0$ and
$H_1$. Let us set $m = n - \lfloor\sqrt n\rfloor$ and call the two parts of the sample $\mathcal D_1 = \{(x_i, t_i) : i = 1, \dots, m\}$
and $\mathcal D_2 = \{(x_i, t_i) : i = m + 1, \dots, n\}$. Using a pilot estimator $\widehat{\Pi f}_n$ of $\Pi f$, we define the
adjusted observations $\tilde x_i = x_i - \widehat{\Pi f}_n(t_i)$ and $\tilde z_i = (\tilde x_i, t_i)$.

Definition 1. Let $w_n = \{w_{l,n}\}_{l\in S_F}$ be an array of real numbers containing a finite number
of nonzero entries and such that $\|w_n\|_2 = 1$. Let $u$ be a real number. We call a linear U-test
based on the array $w_n$ the procedure $\phi_n^{w} = \mathbf 1_{\{U_n^{w} > u\}}$, where $U_n^{w}$ (written $U_n$ when no
confusion can arise) is the linear in $w_n$ U-statistic defined by

$$U_n^{w} = \Big(\frac{2}{m(m-1)}\Big)^{1/2} \sum_{1 \le i < j \le m} \tilde x_i\,\tilde x_j \sum_{l\in S_F} w_{l,n}\,\varphi_l(t_i)\,\varphi_l(t_j). \tag{9}$$

We shall prove that an appropriate choice of $w_n$ and $u$ leads to a linear U-test that is
asymptotically sharp-optimal. The rationale behind this property relies on the by now well-understood
principle of smoothing out high frequencies of a noisy signal. In fact, if we call $\{\theta_l[f]\}_{l\in S_F}$ the
(relevant part of the) representation of $f$ in the frequency domain, then $\{\frac{1}{m}\sum_{i=1}^m \tilde x_i\varphi_l(t_i)\}_{l\in S_F}$
is a nearly unbiased estimator of this representation. Then, the array $w_n$ acts as a low-pass
filter that shrinks to zero the coefficients corresponding to high frequencies in order to prevent
over-fitting.
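
To make the definition of the statistic (9) concrete, the following sketch evaluates it in dimension $d = 1$. It is only an illustration under stated assumptions: the cosine system $\sqrt 2\cos(\pi l t)$ stands in for the basis $\{\varphi_l\}$, the data and the weights are made-up numbers, and the function names are hypothetical. The inner sum over pairs is computed through the elementary identity $\sum_{i<j} a_i a_j = \frac12\big[(\sum_i a_i)^2 - \sum_i a_i^2\big]$ and compared with the explicit double sum.

```python
import math

def cosine_basis(l, t):
    """Centered orthonormal function sqrt(2) cos(pi l t) on [0, 1] (illustrative basis choice)."""
    return math.sqrt(2.0) * math.cos(math.pi * l * t)

def linear_u_statistic(x_adj, t, w):
    """Statistic (9) for d = 1, with w a dict {l: w_l} such that sum of w_l^2 equals 1.

    Uses sum_{i<j} a_i a_j = ((sum a_i)^2 - sum a_i^2) / 2 with a_i = x_adj[i] * phi_l(t[i]).
    """
    m = len(x_adj)
    total = 0.0
    for l, w_l in w.items():
        a = [x * cosine_basis(l, ti) for x, ti in zip(x_adj, t)]
        s1 = sum(a)
        s2 = sum(v * v for v in a)
        total += w_l * (s1 * s1 - s2) / 2.0
    return math.sqrt(2.0 / (m * (m - 1))) * total

def linear_u_statistic_naive(x_adj, t, w):
    """Same statistic, computed by the explicit double sum over pairs i < j."""
    m = len(x_adj)
    total = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            kernel = sum(w_l * cosine_basis(l, t[i]) * cosine_basis(l, t[j])
                         for l, w_l in w.items())
            total += x_adj[i] * x_adj[j] * kernel
    return math.sqrt(2.0 / (m * (m - 1))) * total

# Made-up adjusted observations, design points and weights (0.6^2 + 0.8^2 = 1).
x_example = [0.5, -1.2, 0.3, 2.0, -0.7]
t_example = [0.1, 0.35, 0.6, 0.8, 0.95]
w_example = {1: 0.6, 2: 0.8}
u_fast = linear_u_statistic(x_example, t_example, w_example)
u_naive = linear_u_statistic_naive(x_example, t_example, w_example)
```

The first version costs a number of operations proportional to $m\,\|w_n\|_0$ instead of $m^2\|w_n\|_0$, which is the only reason for preferring it; both return the same value up to rounding.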

The first step in establishing theoretical guarantees on the error rate of a linear U-test consists 
in exploring the behavior of the statistic U n under the null. 

Proposition 1. Let $w_{l,n} \ge 0$ for all $n \in \mathbb N$ and $l \in \mathcal L$. Assume that $E[\xi_1^4] < \infty$ and the
following conditions are fulfilled:

• For some $C_w < \infty$, $\|w_n\|_\infty^2 \|w_n\|_0 \le C_w$.

• As $n \to \infty$, $\|w_n\|_0 \to \infty$ so that $\|w_n\|_0 = o(n)$.

• For some $C_\varphi < \infty$, $\sup_{t\in\Delta} \sum_{l:\, w_{l,n} \ne 0} \varphi_l^2(t) \le C_\varphi \|w_n\|_0$.

• As $n \to \infty$, $\sup_{f\in\Sigma} E_f\big[\|\Pi f - \widehat{\Pi f}_n\|_4^4\big] = o(1)$.

Then, uniformly in $f \in \mathcal F_0$, the U-statistic defined by (9) converges in distribution to the
standard Gaussian distribution $\mathcal N(0,1)$.

In other terms, this proposition claims that under appropriate conditions, for every $u \in \mathbb R$,
the sequence $\sup_{f\in\mathcal F_0} |P_f(U_n > u) - \Phi(-u)|$ tends to zero, as $n$ goes to infinity. This means
that under the null, the distribution of the test statistic $U_n$ is asymptotically parameter free.
This is frequently referred to as Wilks' phenomenon.

To complete the investigation of the error rate of a linear U-test, we need to characterize
the behavior of the test statistic $U_n$ under the alternative. As usual, this step is more
involved. Roughly speaking, we will show that under the alternative the test statistic $U_n$ is
close to a Gaussian random variable with mean $h_n[f, w_n] = \big(\frac{m(m-1)}{2}\big)^{1/2} \sum_{l\in\mathcal L} w_{l,n}\,\theta_l[f]^2$
and variance 1. The rigorous statement is provided in the next proposition.

Proposition 2. Let the assumptions of Proposition 1 be satisfied. Assume that in addition:

• There exists a sequence $\zeta_n$ such that $\zeta_n^{-1} = o(n)$ and $\sup_{l\in S_F:\, w_{l,n} < \zeta_n} c_l^{-1} = o(1)$.

• For some $p > 4$, we have $\sup_{f\in\Sigma} \|\Pi_{S_F} f\|_p < \infty$.

Then, for every $\rho > 0$, the type II error of the linear U-test based on $w_n$ satisfies:

$$\sup_{f\in\mathcal F_1(\rho)} P_f(\phi_n^{w} = 0) \le \sup_{f\in\mathcal F_1(\rho)} \Phi(u - h_n[f, w_n]) + o(1), \tag{10}$$

where the term $o(1)$ does not depend on $\rho$.

Let us provide an informal discussion of the assumptions introduced in the previous
propositions. The first two assumptions in Proposition 1 mean that most nonzero entries of the
array $w_n$ should be of the same order. Arrays that have a few spikes and many small entries
are discarded by these assumptions. Furthermore, the number of samples in the frequency
domain that are not annihilated by $w_n$ should be small as compared to the sample size $n$.
The third assumption of Proposition 1 is trivially satisfied for bases of bounded functions
such as sine and cosine bases and their tensor products. For localized bases like wavelets, this
assumption imposes a constraint on the size of the support of $w_n$: it should not be too small.
The last assumption of Proposition 1 will be discussed in more detail later. Note also
that the only reason for requiring the functions $f$ to be smooth under the null
is the need to be able to construct a uniformly consistent pilot estimator of $\Pi f$.
Concerning the assumptions imposed in Proposition 2, the first one means that only
coefficients $\theta_l$ corresponding to high frequencies are strongly shrunk by $w_n$. This is a kind of
coherence assumption between the smoothing filter $w_n$ and the coefficients $c = \{c_l\}_{l\in\mathcal L}$
encoding the prior information on the signal smoothness. The second assumption of Proposition 2
is rather weak and usual in the context of regression with random design. It is only needed
for getting uniform control of the error rate and the actual value of the norm $\|\Pi_{S_F} f\|_p$ does
not enter in any manner in the definition of the testing procedure.

Let us now draw the consequences of the previous propositions on the cumulative error rate
of a linear U-test. Using the monotonicity of the Gaussian c.d.f. $\Phi$, under the assumptions of
Proposition 2, we get

$$\gamma_n(\mathcal F_0, \mathcal F_1(\rho), \phi_n^{w}) \le \Phi(-u) + \Phi\big(u - \inf_{f\in\mathcal F_1(\rho)} h_n[f, w_n]\big) + o(1), \tag{11}$$

where the term $o(1)$ is uniform in $\rho > 0$. Using the symmetry of $\Phi$ and the monotonicity
of $\Phi'$ on $\mathbb R_+$, one easily checks that the value of the threshold $u$ minimizing the main term
in the right-hand side of the last display is $u = \frac12 \inf_{f\in\mathcal F_1(\rho)} h_n[f, w_n]$. This result provides a
constructive tool for determining the rate of separation of a given linear U-test. In fact, one
only needs to set $u = z_{1-\gamma/2}$ and find a sequence $r_n$ such that $\inf_{f\in\mathcal F_1(r_n)} h_n[f, w_n] \sim 2z_{1-\gamma/2}$,
where $z_\alpha$ is the $\alpha$-quantile of $\mathcal N(0, 1)$.
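
The choice of the threshold can be checked numerically. The sketch below, in which the level $\gamma = 0.05$ and the grid resolution are arbitrary illustrative choices, evaluates the main term $\Phi(-u) + \Phi(u - h)$ of (11) on a grid of thresholds for $h = 2z_{1-\gamma/2}$; the minimizer should be $u = h/2 = z_{1-\gamma/2}$ and the minimal value should equal $\gamma$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian N(0, 1)

def error_bound(u, h):
    """Main term of (11): Phi(-u) + Phi(u - h), with h the infimum of h_n over the alternative."""
    return STD_NORMAL.cdf(-u) + STD_NORMAL.cdf(u - h)

gamma = 0.05                                  # illustrative level, not prescribed by the text
z = STD_NORMAL.inv_cdf(1.0 - gamma / 2.0)     # quantile z_{1 - gamma/2}
h = 2.0 * z                                   # separation at which the bound should equal gamma

# Minimize the bound over a grid of thresholds in [0, h].
grid = [k * h / 1000.0 for k in range(1001)]
u_best = min(grid, key=lambda u: error_bound(u, h))
bound_at_z = error_bound(z, h)
```

Here `u_best` agrees with `z` up to rounding and `bound_at_z` agrees with `gamma`, which is exactly the calibration used to define the rate of separation.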

Remark 1. We explain here the use of $\tilde x_i$ instead of $x_i$ in our testing procedure. Actually, if
we were only interested in rate-optimality, this precaution would not have been necessary.
The problem only arises when dealing with sharp-optimality and it concerns the variance of
$U_n$. Indeed, we need some terms that appear in the variance to tend to zero when $Q[f] = 0$
or $Q[f]$ is small (those terms only need to be bounded for the rate-optimality). If we had
used $x_i$ instead of $\tilde x_i$, we would have ended up with terms like $\|f\|_2$ in the variance. The
information contained in the assertion "$Q[f]$ is small" concerns only the coefficients $\{\theta_l\}_{l\in S_F}$,
thus it implies that $\|\Pi_{S_F} f\|_2$ is small but it does not say anything about $\|f\|_2$. We can also
remark that this problem does not arise in the Gaussian sequence model, as one estimates $\theta_l^2$
by an unbiased estimator whose variance involves only $\theta_l$.

Remark 2. We chose to consider only the criterion $\gamma_n(\mathcal F_0, \mathcal F_1(\rho), \phi_n^{w})$ so as to simplify the
exposition of our results. But we could have dealt with the classical Neyman-Pearson criterion
that we recall here. For a significance level $0 < \alpha < 1$ and a test $\psi$, we set

$$\alpha(\mathcal F_0, \psi) = \sup_{f\in\mathcal F_0} P_f(\psi = 1), \qquad \beta(\mathcal F_1, \psi) = \sup_{f\in\mathcal F_1} P_f(\psi = 0).$$

Instead of the minimax risk $\gamma_n(\mathcal F_0, \mathcal F_1(\rho))$ we could have considered the quantity $\beta_n(\mathcal F_0, \mathcal F_1(\rho)) =
\inf_{\psi:\, \alpha(\mathcal F_0, \psi) \le \alpha} \beta(\mathcal F_1(\rho), \psi)$. This criterion is considered in Ingster and Sapatinas (2009) and
more generally in Ingster and Suslina (2003). The transposition to our case is
straightforward.
The relation (11) being valid for a large variety of arrays $w_n$, it is natural to look for a $w_n$
minimizing the right-hand side of (11). This leads to the following saddle point problem:

$$\sup_{w\in\mathbb R_+^{\mathcal L},\ \|w\|_2 = 1}\ \inf_{f\in\Sigma:\ Q[f] \ge \rho^2}\ \sum_{l\in\mathcal L} w_l\,\theta_l[f]^2 \;=\; \sup_{w\in\mathbb R_+^{\mathcal L},\ \|w\|_2 = 1}\ \inf_{v\in\mathbb R_+^{\mathcal L}:\ \langle v, c\rangle \le 1,\ \langle v, q\rangle \ge \rho^2} \langle w, v\rangle. \tag{12}$$

It turns out that this saddle point problem can be solved with respect to $w$ and leads to a
one-parameter family of smoothing filters $w$.

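Proposition 3. Assume that for every $T > 0$, the set $\mathcal N(T) = \{l \in S_F : c_l < Tq_l\}$ is finite.
For a given $\rho > 0$, assume that the equation

$$\frac{\sum_{l\in\mathcal L} q_l\,(Tq_l - c_l)_+}{\sum_{l\in\mathcal L} c_l\,(Tq_l - c_l)_+} = \rho^2 \tag{13}$$

has a solution and denote it by $T_\rho$. Then, the pair $(w^*, v^*)$ defined by

$$v_l^* = \frac{(T_\rho q_l - c_l)_+}{\sum_{l'\in\mathcal L} c_{l'}\,(T_\rho q_{l'} - c_{l'})_+}, \qquad w_l^* = \frac{v_l^*}{\|v^*\|_2} \tag{14}$$

provides a solution to the saddle point problem (12), that is

$$\langle w^*, v^*\rangle = \sup_{w\in\mathbb R_+^{\mathcal L},\ \|w\|_2 = 1}\ \inf_{v\in\mathbb R_+^{\mathcal L}:\ \langle v, c\rangle \le 1,\ \langle v, q\rangle \ge \rho^2} \langle w, v\rangle = \inf_{v\in\mathbb R_+^{\mathcal L}:\ \langle v, c\rangle \le 1,\ \langle v, q\rangle \ge \rho^2} \langle w^*, v\rangle.$$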



This result tells us that the "optimal" weights $w_n$ for the linear U-test $\phi_n^{w}$ should be of the
form (14), which is particularly interesting because of its dependence on only one parameter
$T > 0$. The next theorem provides a simple strategy for determining the minimax
sharp-optimal test among linear U-tests satisfying some mild assumptions. We will show later in
this section that this test is also minimax sharp-optimal among all possible tests.
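
As an illustration of Proposition 3, the following sketch solves equation (13) by bisection and forms the pair (14) for a toy finite example. Everything specific here is an assumption made for illustration: the arrays $c_l = l^2$, $q_l = 1$ for $l = 1, \dots, 50$, the value $\rho^2 = 1/2$, the search bracket and the function names. The bisection relies on the left-hand side of (13) being nonincreasing in $T$ on the bracket, which holds for this example. For these choices one finds $T_\rho = 7$ by hand, since only $l = 1, 2$ are active and $(2T - 5)/(5T - 17) = 1/2$ gives $T = 7$.

```python
import math

def positive_part(x):
    return x if x > 0.0 else 0.0

def ratio(T, c, q):
    """Left-hand side of (13) for finite arrays c and q."""
    num = sum(ql * positive_part(T * ql - cl) for cl, ql in zip(c, q))
    den = sum(cl * positive_part(T * ql - cl) for cl, ql in zip(c, q))
    return num / den

def solve_T(rho2, c, q, hi=1e6, iters=200):
    """Bisection for T_rho, assuming the ratio is nonincreasing in T on the bracket."""
    # Start just above the smallest c_l / q_l so that the set N(T) is nonempty.
    lo = min(cl / ql for cl, ql in zip(c, q) if ql > 0) * (1.0 + 1e-9)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ratio(mid, c, q) > rho2:
            lo = mid   # ratio still too large: T_rho lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def saddle_pair(T, c, q):
    """Arrays (w*, v*) of (14) for a given T."""
    pos = [positive_part(T * ql - cl) for cl, ql in zip(c, q)]
    den = sum(cl * p for cl, p in zip(c, pos))
    v = [p / den for p in pos]
    norm = math.sqrt(sum(x * x for x in v))
    w = [x / norm for x in v]
    return w, v

# Toy example (illustrative values only).
c_toy = [float(l * l) for l in range(1, 51)]
q_toy = [1.0] * 50
rho2_toy = 0.5
T_rho = solve_T(rho2_toy, c_toy, q_toy)
w_star, v_star = saddle_pair(T_rho, c_toy, q_toy)
v_dot_c = sum(vl * cl for vl, cl in zip(v_star, c_toy))
v_dot_q = sum(vl * ql for vl, ql in zip(v_star, q_toy))
```

In this example $v^*$ has only two nonzero entries, $1/3$ and $1/6$, and both constraints of (12) are active: $\langle v^*, c\rangle = 1$ and $\langle v^*, q\rangle = \rho^2$.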

Theorem 1. Assume that $E[\xi_1^4] < \infty$ and for every $T > 0$, the set $\mathcal N(T) = \{l \in S_F : c_l < Tq_l\}$
is finite. For a prescribed significance level $\gamma \in (0, 1)$, let $T_{n,\gamma}$ be a sequence of positive numbers
such that the following relation holds true: as $n \to \infty$,

$$n\Big(\frac12 \sum_{l\in\mathcal L} (T_{n,\gamma}q_l - c_l)_+^2\Big)^{1/2} = \Big(\sum_{l\in\mathcal L} c_l\,(T_{n,\gamma}q_l - c_l)_+\Big)\big(2z_{1-\gamma/2} + o(1)\big). \tag{15}$$

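Let us define

$$r_{n,\gamma}^* = \bigg(\frac{\sum_{l\in\mathcal L} q_l\,(T_{n,\gamma}q_l - c_l)_+}{\sum_{l\in\mathcal L} c_l\,(T_{n,\gamma}q_l - c_l)_+}\bigg)^{1/2}. \tag{16}$$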

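If the following conditions are fulfilled:

[C1] For some constant $C_1 > 0$, $|\mathcal N(T_{n,\gamma})| \max_{l\in\mathcal N(T_{n,\gamma})} q_l^2 \le C_1 \sum_{l\in\mathcal N(T_{n,\gamma})} (q_l - T_{n,\gamma}^{-1}c_l)^2$.

[C2] As $n \to \infty$, $\sum_{l\in\mathcal N(T_{n,\gamma})} q_l^2 = o\big(n^2 \min_{l\in\mathcal N(T_{n,\gamma})} q_l^2\big)$.

[C3] For some constant $C_3 > 0$, $\sup_{t\in\Delta} \sum_{l\in\mathcal N(T_{n,\gamma})} \varphi_l^2(t) \le C_3\,|\mathcal N(T_{n,\gamma})|$.

[C4] As $n \to \infty$, $|\mathcal N(T_{n,\gamma})| \to \infty$ so that $|\mathcal N(T_{n,\gamma})| = o(n)$.

[C5] As $n \to \infty$, $T_{n,\gamma} \inf_{l\in S_F} q_l$ tends to $+\infty$.

[C6] As $n \to \infty$, $\sup_{f\in\Sigma} E_f\big[\|\Pi f - \widehat{\Pi f}_n\|_4^4\big] = o(1)$.

[C7] For some $p > 4$, it holds that $\sup_{f\in\Sigma} \|\Pi_{S_F} f\|_p < \infty$.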

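then the linear U-test $\phi_n^* = \mathbf 1_{\{U_n^{w^*} > z_{1-\gamma/2}\}}$ based on the array $w^*$ defined by

$$w_l^* = \frac{(T_{n,\gamma}q_l - c_l)_+}{\big[\sum_{l'\in\mathcal L} (T_{n,\gamma}q_{l'} - c_{l'})_+^2\big]^{1/2}}$$

satisfies

$$\gamma_n(\mathcal F_0, \mathcal F_1(r_{n,\gamma}^*), \phi_n^*) \le \gamma + o(1), \qquad \text{as } n \to \infty. \tag{17}$$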

The proof of this result, provided in the Appendix, is a direct consequence of Propositions 1,
2 and 3. As we shall see below, the rate $r_{n,\gamma}^*$ defined in Theorem 1 is the minimax sharp-rate
in the problem of testing hypotheses (5), provided that the assumptions of the theorem are
fulfilled. As expected, getting such a strong result requires non-trivial assumptions on the
nature of the functional class, that of the hypotheses to be tested, as well as the interplay
between them. Some short comments on these assumptions are provided in the remark below,
with a further development left to subsequent sections.

Remark 3. The very first assumption is that the set $\mathcal N(T)$ is finite. It is necessary for ensuring
that the linear U-test we introduced is computable. This assumption is fulfilled when, roughly
speaking, the coefficients which express the regularity, $\{c_l\}_{l\in\mathcal L}$, grow at a faster rate than the
coefficients $\{q_l\}_{l\in\mathcal L}$ of the quadratic functional $Q$. Assumptions [C1], [C2], [C4] and [C5] are
satisfied in most cases we are interested in. Two illustrative examples (concerning Sobolev
ellipsoids with quadratic functionals related to partial derivatives) for which these hypotheses
are satisfied are presented in Subsections 3.3 and 3.4. Assumption [C3] is essentially a
constraint on the basis $\{\varphi_l\}$; we show in Subsection 3.1 that it is satisfied by many bases
commonly used in statistical literature. [C6] and [C7] are related to additional technicalities
brought by the regression model, which force us to impose more regularity than in the
Gaussian sequence model.

2.4. Lower bound

We shall state in this section the result showing that the rate $r_{n,\gamma}^*$ introduced in Theorem 1
is the minimax rate of testing and the exact separation constant associated with this rate is
equal to one. This also implies that the testing procedure proposed in the previous subsection is
not only minimax rate-optimal but also minimax sharp-optimal among all possible testing
procedures. In this subsection, we consider the functional classes $\Sigma = \Sigma_{p,L}$ defined by

$$\Sigma_{p,L} = \Big\{f = \sum_{l\in\mathcal L} \theta_l[f]\varphi_l \ :\ \sum_{l\in\mathcal L} c_l\,\theta_l[f]^2 \le 1,\ \|f\|_p \le L,\ \Pi f = 0\Big\}.$$

Clearly, for $p > 4$, this functional class is smaller than those satisfying conditions of Theorem 1.
Therefore, any lower bound proven for these functional classes will also be a lower bound for
the functional classes for which Theorem 1 is applicable.
Theorem 2. Assume that the $\xi_i$'s are standard Gaussian random variables and that for every
$T > 0$, the set $\mathcal N(T) = \{l \in S_F : c_l < Tq_l\}$ is finite. For a prescribed significance level
$\gamma \in (0, 1)$, let $T_{n,\gamma}$ and $r_{n,\gamma}^*$ be as in Theorem 1. If conditions [C1], [C3] and

[C8] as $n \to \infty$, $|\mathcal N(T_{n,\gamma})| \to \infty$ so that $|\mathcal N(T_{n,\gamma})| \log(|\mathcal N(T_{n,\gamma})|) = o(n)$,

[C9] as $n \to \infty$, $\max_{l\in\mathcal N(T_{n,\gamma})} c_l = o\big(n\,|\mathcal N(T_{n,\gamma})|^{1/2}\big)$,

are fulfilled, then for every $C < 1$ the minimax risk satisfies

$$\gamma_n(\mathcal F_0, \mathcal F_1(C r_{n,\gamma}^*)) \ge \gamma + o(1), \qquad \text{as } n \to \infty. \tag{18}$$

Although the main steps of the proof of this theorem, postponed to the Appendix, are close
to those of (Ingster and Sapatinas, 2009), we have made several improvements which resulted
in a shorter and more transparent proof under relaxed assumptions. The most notable
improvement is perhaps the fact that in condition [C3] it is not necessary to have $C_3 = 1$.
We will further discuss this point and the other assumptions in the next section.

Remark 4. If we were only interested in minimax rate-optimality, we could have used a simpler
prior in the proof of Theorem 2, which would also yield the desired lower bound under slightly
weaker assumptions. One can also deduce from the proof that, for a concrete pair $(c, q)$, a
simple way to find the minimax rate of separation consists in solving w.r.t. $r_n$
the relation $n (r_n)^2 \asymp M(r_n^{-2})^{1/2}$, where $M(T) = \sum_{l\in\mathcal N(T)} q_l^2$.
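The recipe of Remark 4 can be tried numerically. The sketch below is only an illustration under stated assumptions: the relation is replaced by the exact equation $n r^2 = M(r^{-2})^{1/2}$ (constants are ignored), the arrays are a hypothetical one-dimensional pair $c_l = (2\pi l)^{2\sigma}$, $q_l = 1$ truncated to finitely many indices, and the function names `M` and `separation_rate` are ours. For this pair the heuristic rate is $n^{-2\sigma/(4\sigma+1)}$.

```python
import math

def M(T, c, q):
    """M(T) = sum of q_l^2 over N(T) = {l : c_l < T q_l} (finite arrays, q_l > 0)."""
    return sum(ql * ql for cl, ql in zip(c, q) if cl < T * ql)

def separation_rate(n, c, q, lo=1e-6, hi=1.0, iters=60):
    """Bisection on a log scale for r solving n r^2 = M(r^-2)^(1/2).

    g(r) = n r^2 - sqrt(M(r^-2)) is nondecreasing in r: n r^2 grows with r,
    while M(r^-2) can only shrink. If g jumps over zero, the jump point is returned.
    """
    def g(r):
        return n * r * r - math.sqrt(M(r ** -2, c, q))
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical example (not from the paper): d = 1, sigma = 1.
sigma = 1.0
c = [(2 * math.pi * l) ** (2 * sigma) for l in range(1, 2001)]
q = [1.0] * len(c)

r_small = separation_rate(10 ** 3, c, q)
r_large = separation_rate(10 ** 5, c, q)
# Empirical exponent between the two sample sizes; heuristically -2*sigma/(4*sigma+1) = -0.4.
slope = math.log(r_large / r_small) / math.log(10 ** 5 / 10 ** 3)
print(r_small, r_large, slope)
```

Because $M$ is a step function, the computed exponent only approximates $-0.4$; the agreement improves as $n$ grows.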



3. Examples 



3.1. Bases satisfying assumption [C3] 



First we give examples of orthonormal bases satisfying assumption [C3], irrespective of the
nature of the arrays $c$ and $q$ defining the smoothness class and the quadratic functional $Q$.
Note that, despite the more general setting considered in the present work, our assumption
[C3] is significantly weaker than the corresponding assumption in (Ingster and Sapatinas,
2009), which requires $C_3$ to be equal to one. In fact, in a remark, Ingster and Sapatinas
(2009) suggest that their proof remains valid under our assumption [C3] if assumption [C4]
is strengthened to $|\mathcal N(T_{n,\gamma})| = o(n^{2/3})$. Thanks to a better analysis, we succeeded in
establishing sharp asymptotics under the weak version of [C3] without any additional price (except
that a logarithmic factor now appears in the corresponding condition in Theorem 2).



Fourier basis Let us consider first the following Fourier basis in dimension $d$, for which
$\mathcal L = \mathbb Z^d$ and

$$\varphi_k(t) = \begin{cases} 1, & k = 0,\\ \sqrt{2}\cos(2\pi\, k\cdot t), & k \in (\mathbb Z^d)_+,\\ \sqrt{2}\sin(2\pi\, k\cdot t), & -k \in (\mathbb Z^d)_+, \end{cases} \qquad (19)$$

where $(\mathbb Z^d)_+$ denotes the set of all $k \in \mathbb Z^d \setminus \{0\}$ such that the first nonzero element of $k$ is
positive and $k \cdot t$ stands for the usual inner product in $\mathbb R^d$. Since all the basis functions are
bounded by $\sqrt{2}$, [C3] is obviously satisfied with $C_3 = 2$. Furthermore, if the set $\mathcal N(T)$ is
symmetric, i.e., $k \in \mathcal N(T)$ implies $-k \in \mathcal N(T)$, then [C3] is fulfilled with $C_3 = 1$.

Tensor product Fourier basis We can also consider the traditional tensor product Fourier
basis as in Ingster and Sapatinas (2009). [C3] is then obviously satisfied with $C_3 = 2^d$.
Moreover, if the set $\mathcal N(T)$ is orthosymmetric, i.e., $(k_1, \ldots, k_d) \in \mathcal N(T)$ implies
$(\pm k_1, \ldots, \pm k_d) \in \mathcal N(T)$, then [C3] is fulfilled with $C_3 = 1$.
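The role of symmetry can be checked numerically in dimension one. The sketch below assumes that [C3] is read as $\sup_t \sum_{l\in\mathcal N(T)}\varphi_l(t)^2 \le C_3|\mathcal N(T)|$ (the form that condition [D2] takes later in the paper); the helper names are ours. It verifies orthonormality of the basis (19) on a uniform grid and compares the ratio $\sup_t\sum\varphi_k(t)^2/|\mathcal N|$ for a symmetric index set and for a cosine-only set.

```python
import numpy as np

def phi(k, t):
    """Basis (19) for d = 1: 1 for k = 0, sqrt(2) cos for k > 0, sqrt(2) sin for k < 0."""
    t = np.asarray(t, dtype=float)
    if k == 0:
        return np.ones_like(t)
    trig = np.cos if k > 0 else np.sin
    return np.sqrt(2.0) * trig(2 * np.pi * k * t)

def gram(ks, m=64):
    """Gram matrix of the basis functions; the uniform grid integrates these
    trigonometric polynomials exactly as long as m exceeds twice the top frequency."""
    t = np.arange(m) / m
    Phi = np.stack([phi(k, t) for k in ks])
    return Phi @ Phi.T / m

def sup_sum_squares(ks, m=64):
    """Maximum over the grid of sum_k phi_k(t)^2."""
    t = np.arange(m) / m
    return max(sum(phi(k, t) ** 2 for k in ks))

K = 5
symmetric = list(range(-K, K + 1))      # k in the set implies -k in the set
cosines_only = list(range(1, K + 1))    # not symmetric

ratio_symmetric = sup_sum_squares(symmetric) / len(symmetric)
ratio_cosines = sup_sum_squares(cosines_only) / len(cosines_only)
print(ratio_symmetric, ratio_cosines)
```

For the symmetric set, $\cos^2 + \sin^2 = 1$ makes the sum of squares constant in $t$, which is why the ratio equals one; for the cosine-only set the supremum is attained at $t = 0$ and the ratio equals two.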

Haar basis Let $\{\psi_{j,k}(\cdot) : j \in \mathbb N,\ k \in \{1, \ldots, 2^j\}\}$ be the standard orthonormal Haar basis
on [0, 1], where $j$ is the scale parameter and $k$ is the shift. The tensor product Haar
basis $(\varphi_{j,k})_{j,k}$ is then

$$\varphi_{j,k}(t) = \prod_{i=1}^{d} \psi_{j_i,k_i}(t_i),$$

where $j = (j_1, \ldots, j_d)$ and $k = (k_1, \ldots, k_d)$. As shown in (Ingster and Sapatinas, 2009), under
the extra assumption that the coefficients $c_l = c_{j,k}$ and $q_l = q_{j,k}$ depend only on the scale
parameter, i.e., $c_{j,k} = c_j$ and $q_{j,k} = q_j$, assumption [C3] is satisfied with $C_3 = 1$. Note that
the same holds true for the multivariate Haar basis defined in the more commonly used way
(see Cohen (2003), chapter 2): $\varphi_l(t) = \prod_{i=1}^{d} \psi^{\omega_i}_{j,k_i}(t_i)$, where $l = (j, k, \omega)$ is such that $j \in \mathbb N$,
$k \in \{1, \ldots, 2^j\}^d$ and $\omega \in \{0, 1\}^d \setminus \{0\}$, with $\psi^0_{j,k}$ and $\psi^1_{j,k}$ being the scaled and shifted mother
wavelet and father wavelet, respectively.

Compactly supported wavelet basis Since we are not limited to the case $C_3 = 1$, any
orthonormal wavelet basis satisfies assumption [C3], as long as the wavelets are compactly
supported and provided that the coefficients $c_l$ and $q_l$ depend on the resolution level
and not on the shift.

3.2. Examples of estimators satisfying [C6] 

We present below pilot estimators that in two different contexts satisfy assumption [C6]. 

Tensor-product Fourier basis For the first example, we assume that the orthonormal
system $\{\varphi_l\}$ is the tensor product Fourier basis. Then we have $\sup_l \sup_{t\in\Delta} |\varphi_l(t)| \le 2^{d/2}$. The
anisotropic Sobolev ball with radius $R$ and smoothness $\sigma = (\sigma_1, \ldots, \sigma_d) \in (0, \infty)^d$ is defined
by

$$W_2^{\sigma}(R) = \Big\{ f : \sum_{l\in\mathbb Z^d} \sum_{i=1}^{d} (2\pi l_i)^{2\sigma_i}\, \theta_l[f]^2 \le R \Big\}.$$

The estimator we suggest to use is constructed as follows. We first estimate $\theta_l[f]$ by
$\hat\theta_l = \frac{1}{n}\sum_{i=1}^{n} x_i \varphi_l(t_i)$. Then we choose a tuning parameter $T = T_n > 0$ and define the pilot estimator

$$\widehat{\Pi f}_n = \sum_{l \in S_F^c :\ c_l < T} \hat\theta_l\, \varphi_l. \qquad (20)$$

To ease notation, we set $\mathcal N_1(T) = \{l \in S_F^c : c_l < T\}$ and $\mathcal N_2(T) = S_F^c \setminus \mathcal N_1(T)$.
Lemma 1. Assume that either one of the following conditions is satisfied:

• $c$ satisfies the condition $\sum_l c_l^{-1} < \infty$,

• $\Sigma \subset W_2^{\sigma}(R)$ for some $R > 0$ and for some $\sigma \in (0, \infty)^d$ such that $\bar\sigma = \big(\frac{1}{d}\sum_{i=1}^{d}\sigma_i^{-1}\big)^{-1} >$
If $T = T_n \to \infty$ so that $|\mathcal N_1(T)| = o(n^{1/2})$, then $\widehat{\Pi f}_n$ defined by (20) satisfies [C6].

Compactly supported orthonormal wavelet basis The same method can be applied
in the case of an orthonormal basis of compactly supported wavelets of $L_2([0, 1]^d)$. We suppose
that the coefficients $c_l = c_{j,k}$ correspond to those of a Besov ball $B^s_{2,2}$, i.e., $c_j = 2^{2js}$, and that
$\sigma = s - d/4 > 0$. Let us set, for $J \in \mathbb N$,

$$\widehat{\Pi f}_n = \sum_{k\in[1,2^J]^d} \hat\alpha_{J,k}\,\varphi_{J,k}, \quad\text{where}\quad \hat\alpha_{J,k} = \frac{1}{n}\sum_{i=1}^{n} x_i\,\varphi_{J,k}(t_i).$$

Lemma 2. If $J = J_n$ tends to infinity so that $2^{Jd} = o(n)$, then $\sup_{f\in\Sigma} E_f\|\Pi f - \widehat{\Pi f}_n\|_4 \to 0$
as $n \to \infty$.

In the following two subsections, we apply the previous results to two examples of quadratic 
functionals involving derivatives. The orthonormal system we use is the tensor product Fourier 
basis. 



3.3. Testing partial derivatives 

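We assume here that $f$ belongs to a Sobolev class with anisotropic constraints and the
quadratic functional $Q$ corresponds, roughly speaking, to the squared $L_2$-norm of a partial
derivative. More precisely, let $\alpha = (\alpha_1, \ldots, \alpha_d)$ and $\sigma = (\sigma_1, \ldots, \sigma_d)$ be two given vectors and
define, for every $l \in \mathcal L = \mathbb Z^d \setminus \{0\}$,

$$q_l = \prod_{j=1}^{d} (2\pi l_j)^{2\alpha_j} \quad\text{and}\quad c_l = \sum_{j=1}^{d} (2\pi l_j)^{2\sigma_j}.$$

We will assume that $\sum_{j=1}^{d} (\alpha_j/\sigma_j) < 1$.

For a function $f = \sum_{l\in\mathcal L} \theta_l \varphi_l \in L_2(\Delta)$, we set $\|f\|_{2,c}^2 = \sum_{l\in\mathcal L} c_l\theta_l^2$ and $\|f\|_{2,q}^2 = \sum_{l\in\mathcal L} q_l\theta_l^2$.
Then, for a 1-periodic function which is differentiable enough, and if the $\alpha_j$ and $\sigma_j$ are integers,
we have

$$\|f\|_{2,q}^2 = \Big\| \frac{\partial^{\alpha_1+\cdots+\alpha_d} f}{\partial t_1^{\alpha_1}\cdots\partial t_d^{\alpha_d}} \Big\|_2^2 \quad\text{and}\quad \|f\|_{2,c}^2 = \sum_{j=1}^{d} \Big\| \frac{\partial^{\sigma_j} f}{\partial t_j^{\sigma_j}} \Big\|_2^2.$$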
Proposition 4. Let us define 5, a, (kj) and k by 5 = Yl'j=i a j/ a j' if = 2 <Cj=i ~t~ 



is 



3 ' J 

"t" V~ 25(1-8) an d K = Sj=i K y V ^ ^ an< ^ ® > ^1^) then the exact minimax rate r* 7 
given by $r^*_{n,\gamma} = C^* r^*_n (1 + o(1))$, where the minimax rate $r^*_n$ and the exact separation constant
$C^*$ are

2&{1-S) _ g(l-g) 2(l+i5) t r+d 

r,*=n 4<r+d j an rf c* = [±z{ , 2 K,C{d, a, a)) (1 + 2k' 1 ) 2 < 4 -+ d > 



C(d >t r,a) = 7r- d - d n ^ r(K * 



;nti^)a-w«+2) 

Furthermore, the sequence of linear U-tests $\phi_n$ of Theorem 1 is asymptotically minimax with

$$T_{n,\gamma} \sim (r^*_{n,\gamma})^{-2}\,(1 + 2\kappa^{-1}).$$

Remark 5. The previous result can be used for performing dimensionality reduction through
variable selection (Comminges and Dalalyan, 2011). Indeed, in a high-dimensional set-up it
is of central interest to eliminate the irrelevant covariates. The coordinate $t_i$ of $t$ is irrelevant
if $f$ is constant on the line $\{t \in \Delta : t_j = a_j \text{ for all } j \ne i\}$, whatever the vector $a \in \Delta$ is. This
implies that the $i$th partial derivative of $f$ is zero. Therefore, one can test the relevance of
a variable, say $t_1$, by comparing $\|\partial f/\partial t_1\|_2$ with 0. In our notation, this amounts to testing
hypotheses (5) with $Q[f] = \|f\|_{2,q}^2$ such that $q_l = (2\pi l_1)^2$. Combining Proposition 4 and
Theorem 1, one can easily deduce a minimax sharp-optimal test and the minimax sharp-rates
for this variable selection problem.
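The identity behind this remark, $\sum_l (2\pi l_1)^2\theta_l^2 = \|\partial f/\partial t_1\|_2^2$, can be checked on a small trigonometric polynomial. The sketch below is an illustration only: it takes $d = 2$, the Fourier basis (19), and hypothetical coefficient dictionaries `theta` and `theta_irrelevant`; the function names are ours.

```python
import numpy as np

def first_nonzero_positive(k):
    """True if the first nonzero entry of k is positive, i.e. k lies in (Z^d)_+."""
    for kj in k:
        if kj != 0:
            return kj > 0
    raise ValueError("k must be nonzero")

def dphi_dt1(k, t1, t2):
    """Partial derivative in t1 of the basis function (19) for d = 2, k != 0."""
    arg = 2 * np.pi * (k[0] * t1 + k[1] * t2)
    factor = np.sqrt(2.0) * 2 * np.pi * k[0]
    if first_nonzero_positive(k):
        return -factor * np.sin(arg)   # derivative of sqrt(2) cos(arg)
    return factor * np.cos(arg)        # derivative of sqrt(2) sin(arg)

def Q_from_coefficients(theta):
    """Q[f] = sum_l (2 pi l_1)^2 theta_l^2."""
    return sum((2 * np.pi * k[0]) ** 2 * th ** 2 for k, th in theta.items())

def Q_from_derivative(theta, m=32):
    """Squared L2 norm of df/dt1 on [0,1]^2; the uniform grid is exact for these
    low-frequency trigonometric polynomials."""
    g = np.arange(m) / m
    t1, t2 = np.meshgrid(g, g, indexing="ij")
    df = sum(th * dphi_dt1(k, t1, t2) for k, th in theta.items())
    return float(np.mean(df ** 2))

# Hypothetical Fourier coefficients theta_k, indexed by k in Z^2 \ {0}.
theta = {(1, 0): 0.5, (-1, 2): -0.3, (0, 1): 0.8, (2, -1): 0.1}
# A function that does not depend on t1: only frequencies with k_1 = 0.
theta_irrelevant = {(0, 1): 0.8, (0, -3): 0.2}

print(Q_from_coefficients(theta), Q_from_derivative(theta))
print(Q_from_coefficients(theta_irrelevant), Q_from_derivative(theta_irrelevant))
```

When the first coordinate is irrelevant, every active frequency has $l_1 = 0$ and both expressions vanish, which is the null hypothesis $Q[f] = 0$.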

Remark 6. Another interesting particular case of the setting described in this subsection
concerns the problem of component identification in partial linear models (Samarov et al.,
2005). We say that $f$ obeys a partial linear model if for some small subset $J$ of indices
$\{1, \ldots, d\}$ and for a vector $\beta \in \mathbb R^{|J^c|}$, one can write $f(t) = g(t_J) + \beta^\top t_{J^c}$ for every $t \in \Delta$. The
problem of component identification in this model is to determine for an index $j$ whether $j \in J$
or not. One way of addressing this issue is to perform a test of the hypothesis $Q[f] = \|f\|_{2,q}^2 = 0$,
where $q_l = (2\pi l_j)^4$. Roughly speaking, this corresponds to checking whether the second order
partial derivative of $f$ with respect to $t_j$ is zero or not (if the null is not rejected, then
$j \in J^c$). Once again, Proposition 4 and Theorem 1 provide a minimax sharp-optimal test for
this problem along with the minimax rates and exact separation constants.

Remark 7. In the case where the covariates $t_i$ are not observable and only noisy versions of
them are available, our model coincides with the convolution model, for which the minimax rates of
testing were obtained by Butucea (2007) in the one-dimensional case with a simple null hypothesis.
It would be interesting to extend our results to such a model and to get minimax rates and, if
possible, separation constants in the multidimensional convolution model.



3.4. Testing the relevance of a direction in a single-index model

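Recall that a single-index model is a particular case of (1) corresponding to functions $f$ that
can be written in the form $f(t) = g(\beta^\top t)$ for some univariate function $g : \mathbb R \to \mathbb R$ and
some vector $\beta \in \mathbb R^d$. Assume now that for a candidate vector $\beta \in \mathbb R^d \setminus \{0\}$ we wish to test
the goodness-of-fit of the single-index model (Dalalyan et al., 2008; Gaïffas and Lecué, 2007).
This corresponds to testing the hypothesis

$$\exists\, g : \mathbb R \to \mathbb R \ \text{ such that } \ f(t) = g(\beta^\top t), \quad \forall t \in \Delta.$$

This condition implies that

$$\frac{\partial f}{\partial t_i}(t) = \frac{\beta_i}{\|\beta\|_2^2} \sum_{j=1}^{d} \beta_j \frac{\partial f}{\partial t_j}(t) = \frac{\beta_i}{\|\beta\|_2^2}\, \beta^\top \nabla f(t), \quad \forall i \in \{1, \ldots, d\},$$

which in turn can be written as

$$\sum_{i=1}^{d} \Big\| \frac{\partial f}{\partial t_i} - \frac{\beta_i}{\|\beta\|_2^2}\, \beta^\top \nabla f \Big\|_2^2 = 0.$$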

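Without loss of generality, we assume that $\|\beta\|_2 = 1$ and set $q_l = \sum_{i=1}^{d} (2\pi)^2 (l_i - (\beta^\top l)\beta_i)^2 =
(2\pi)^2 (\|l\|_2^2 - (\beta^\top l)^2)$. We consider homogeneous Sobolev smoothness classes, that is $c_l =
\sum_{i=1}^{d} (2\pi l_i)^{2\sigma}$, with $\sigma > d/4$. Then, when $\sigma$ is an integer, for a 1-periodic function which is
smooth enough,

$$\|f\|_{2,c}^2 = \sum_{i=1}^{d} \Big\| \frac{\partial^{\sigma} f}{\partial t_i^{\sigma}} \Big\|_2^2 \quad\text{and}\quad \|f\|_{2,q}^2 = \sum_{i=1}^{d} \Big\| \frac{\partial f}{\partial t_i} - \beta_i\, \beta^\top \nabla f \Big\|_2^2.$$

To state the result providing the minimax rate and the exact constant in this problem, we
introduce the constants

$$C_0 = \frac{1}{(2\pi)^d} \int_{\mathbb R^d} \big[ \|x\|_2^2 - (\beta^\top x)^2 - \|x\|_{2\sigma}^{2\sigma} \big]_+^2 \, dx,$$

$$C_1 = \frac{1}{(2\pi)^d} \int_{\mathbb R^d} \big( \|x\|_2^2 - (\beta^\top x)^2 \big) \big( \|x\|_2^2 - (\beta^\top x)^2 - \|x\|_{2\sigma}^{2\sigma} \big)_+ \, dx,$$

and $C_2 = C_1 - C_0$.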
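For completeness, the closed form of $q_l$ used in this subsection follows from expanding the square and using $\|\beta\|_2 = 1$; a short worked derivation (our own spelling-out, not taken from the paper):

```latex
\begin{align*}
\sum_{i=1}^{d} \bigl(l_i - (\beta^\top l)\beta_i\bigr)^2
  &= \sum_{i=1}^{d} l_i^2 \;-\; 2(\beta^\top l)\sum_{i=1}^{d} l_i\beta_i
     \;+\; (\beta^\top l)^2 \sum_{i=1}^{d}\beta_i^2 \\
  &= \|l\|_2^2 - 2(\beta^\top l)^2 + (\beta^\top l)^2\,\|\beta\|_2^2 \\
  &= \|l\|_2^2 - (\beta^\top l)^2 .
\end{align*}
```

By the Cauchy-Schwarz inequality this quantity is nonnegative and vanishes exactly when $l$ is collinear with $\beta$, so $q_l \ge 0$ and the functional is of the nonnegative diagonal type treated in Section 2.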

Proposition 5. In the setting described above, the exact minimax rate $r^*_{n,\gamma}$ is given by
$r^*_{n,\gamma} = C^*_\gamma r^*_n (1 + o(1))$, where

rn = n — and Cj = ( ) " 

The sequence of tests $\phi_n$ of Theorem 1 is minimax sharp-optimal if $T = T_{n,\gamma}$ is chosen as
$T = (C^*_\gamma r^*_n)^{-2} (C_1/C_2)$.



4. Nonpositive and nonnegative diagonal quadratic functionals 

In this section we consider the more general setting obtained by abandoning the assumption
that all the entries $q_l$ of the array $q$ have the same sign. That is, we still have $Q[f] = \sum_{l\in\mathcal L} q_l\theta_l[f]^2$
but now

$$\mathcal L_+ = \{l : q_l > 0\} \ne \emptyset \quad\text{and}\quad \mathcal L_- = \{l : q_l < 0\} \ne \emptyset. \qquad (21)$$

The sets $\mathcal F_0$ and $\mathcal F_1(r_n)$ are defined as before, cf. (5), and we use the same notation as in the
positive case. Namely, for $T > 0$, we set $\mathcal N(T) = \{l \in S_F : c_l < T|q_l|\}$, $N(T) = |\mathcal N(T)|$ and
$M(T) = \sum_{l\in\mathcal N(T)} q_l^2$.

We point out that, in the case considered in this section, a phenomenon of phase transition
occurs: there is a regular case in which the rate is independent of the precise degree of
smoothness, and an irregular case where the rate is smoothness-dependent. To be more precise,
let $|Q|$ denote the diagonal positive quadratic functional whose coefficients are $|q_l|$ for every
$l \in \mathcal L$. Let us recall that the minimax rate $r^*_n$ in testing the significance of $|Q|$ (see Remark 4)
is determined by

$$n (r^*_n)^2 \asymp M\big((r^*_n)^{-2}\big)^{1/2}.$$

In our context, this rate corresponds to the irregular case: if $\Sigma$ contains functions that are
not smooth enough (compared to the difficulty of the problem, that is to say if the $q_l$'s are
"too large" compared to the $c_l$'s), the minimax rate corresponding to $Q$ is the same as for $|Q|$
obtained in previous sections. By contrast, in the regular case, the minimax rate is
smoothness-independent and equals $r^*_n = n^{-1/4}$.

4.1. Testing procedure and upper bound on the minimax rate

The testing procedure we use in the present context is of the same type as the one used
for nonnegative quadratic functionals. More precisely, for a tuning parameter $T_n$ and for a
threshold $u$, we set $\phi_n(T) = \mathbb 1_{\{|U_n(T)| > u\}}$, where the U-statistic $U_n(T)$ is defined by

$$U_n(T) = \binom{n}{2}^{-1/2} \sum_{1\le i<j\le n} x_i x_j\, G_T(t_i, t_j),$$

with $G_T(t_1, t_2) = M(T)^{-1/2} \sum_{l\in\mathcal N(T)} q_l\, \varphi_l(t_1)\varphi_l(t_2)$.
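A minimal computational sketch of this statistic, under stated assumptions: dimension one, a sine/cosine basis, simulated data, and hypothetical arrays $c_l$ and $q_l$ (none of these choices comes from the paper; the function names and the threshold value are ours). It uses the identity $\sum_{i<j} a_i a_j = \frac12\big[(\sum_i a_i)^2 - \sum_i a_i^2\big]$ applied to $a_i = x_i\varphi_l(t_i)$, so the statistic costs $O(n|\mathcal N(T)|)$ instead of $O(n^2|\mathcal N(T)|)$, and checks the result against the direct double sum.

```python
import math
import numpy as np

def phi(l, t):
    """Trigonometric basis on [0, 1]: sqrt(2) cos(2 pi l t) for l > 0, sqrt(2) sin(2 pi l t) for l < 0."""
    t = np.asarray(t, dtype=float)
    trig = np.cos if l > 0 else np.sin
    return np.sqrt(2.0) * trig(2 * np.pi * l * t)

def truncate(ls, c, q, T):
    """Keep the indices of N(T) = {l : c_l < T |q_l|}."""
    keep = c < T * np.abs(q)
    return ls[keep], q[keep]

def u_statistic(x, t, ls, q):
    """U_n(T) = C(n,2)^(-1/2) sum_{i<j} x_i x_j G_T(t_i, t_j), in O(n |N(T)|) operations."""
    n = len(x)
    Phi = np.column_stack([phi(l, t) for l in ls])    # n x |N(T)| design matrix
    a = Phi.T @ x                                     # a_l = sum_i x_i phi_l(t_i)
    b = (Phi ** 2).T @ (x ** 2)                       # b_l = sum_i x_i^2 phi_l(t_i)^2
    M = float(np.sum(q ** 2))                         # M(T)
    pair_sum = 0.5 * float(np.sum(q * (a ** 2 - b)))  # sum over pairs i < j
    return pair_sum / math.sqrt(M * math.comb(n, 2))

def u_statistic_naive(x, t, ls, q):
    """Direct double sum, used only to check the fast version."""
    n = len(x)
    M = float(np.sum(q ** 2))
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            g = sum(ql * phi(l, t[i]) * phi(l, t[j]) for l, ql in zip(ls, q))
            total += x[i] * x[j] * g
    return float(total) / math.sqrt(M * math.comb(n, 2))

rng = np.random.default_rng(0)
n = 40
t = rng.uniform(size=n)
x = np.cos(2 * np.pi * t) + rng.standard_normal(n)   # hypothetical observations
all_ls = np.array([l for l in range(-10, 11) if l != 0])
c = (2 * np.pi * np.abs(all_ls)) ** 2.0              # hypothetical c_l
q = np.where(all_ls > 0, 1.0, -1.0)                  # hypothetical sign-changing q_l
ls, qT = truncate(all_ls, c, q, T=500.0)

U = u_statistic(x, t, ls, qT)
reject = abs(U) > 2.0   # the threshold u = 2.0 is an arbitrary placeholder
print(len(ls), U, reject)
```

In practice the threshold $u$ would be chosen as in the theorem that follows rather than fixed by hand.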

Theorem 3. Let $\gamma \in (0, 1)$ be a fixed significance level. Let us denote by $T_q[f]$ the linear
functional $T_q[f] = \sum_{l\in\mathcal N(T)} q_l\,\theta_l[f]\,\varphi_l$. Assume that $T > 0$ is such that the assumptions

[D1] there exists $D_1 > 0$ such that $|\mathcal N(T)| \max_{l\in\mathcal N(T)} q_l^2 \le D_1 \sum_{l\in\mathcal N(T)} q_l^2$,

[D2] there exists $D_2 > 0$ such that $\sup_{t\in\Delta} \sum_{l\in\mathcal N(T)} \varphi_l(t)^2 \le D_2 |\mathcal N(T)|$,

[D3] there exists $D_3 > 0$ such that $\sup_{f\in\Sigma} \|f\|_4 \le D_3$,

[D4] there exists $D_4 > 0$ such that $\sup_{f\in\Sigma} \|f \cdot T_q[f]\|_2 \le D_4$,

are fulfilled. Set B x = 6 + 12D 1 D 2 D 2 + §D X D 2 D\ and B 2 = 4D 4 . Then, for every 
u > -fi== + r l/2 {Bi + B 2 nM(T)- 1 ) 1/2 1 

the type I error is bounded by $\gamma/2$: $\sup_{f\in\mathcal F_0} P_f(\phi_n(T) = 1) \le \gamma/2$.
If, in addition,



P 2 >[u + 7 -V2 ( Bl + ^nM(T)- 1 ) 1 / 2 ] + ^ 

then the type II error is also bounded by $\gamma/2$: $\sup_{f\in\mathcal F_1(\rho)} P_f(\phi_n(T) = 0) \le \gamma/2$.

As a consequence, if we choose $u = (2M(T))^{-1/2}(n/T) + \gamma^{-1/2}(B_1 + B_2\, n M(T)^{-1})^{1/2}$ then
the cumulative error rate of the test $\phi_n(T)$ is bounded by $\gamma$ for every alternative $\mathcal F_1(\rho)$ such
that $\rho^2 \ge 4\gamma^{-1/2} n^{-1} (B_1 M(T) + B_2 n)^{1/2} + 2\sqrt{2}\, T^{-1}$.

This theorem provides a nonasymptotic evaluation of the cumulative error rate of the linear
U-test based on the array $w_l \propto q_l$ truncated at the level $T$. In the cases where the constants $B_1$
and $B_2$ can be reliably estimated and the function $M(T)$ admits a simple form, it is reasonable
to choose the truncation level $T$ by minimizing the expression $4\gamma^{-1/2} n^{-1} (B_1 M(T) + B_2 n)^{1/2} +
2\sqrt{2}\, T^{-1}$. By choosing $T$ in such a way, we try to enlarge the set of alternatives for which the
cumulative error rate stays below the prescribed level $\gamma$. Therefore, the last theorem implies
the following non-asymptotic upper bound on the minimax rate of separation:

$$(r^*_{n,\gamma})^2 \le \inf_{T} \Big( \frac{4 (B_1 M(T) + B_2 n)^{1/2}}{\gamma^{1/2}\, n} + \frac{2\sqrt{2}}{T} \Big).$$

This non-asymptotic bound clearly shows the presence of two asymptotic regimes. The first
one corresponds to the case where $n$ is much larger than $M(T^*)$, whereas the second regime
corresponds to $n = o(M(T^*))$. Here, $T^*$ is the minimizer of the bound on $\rho^2$ obtained in
the theorem above. The next corollary exhibits the rates of separation in these two different
regimes.

Corollary 1. Assume that the arrays $q$ and $c$ are such that $M(aT) \asymp_{T\to\infty} M(T)$ for every
$a > 0$. Let $T_n^0$ be any sequence of positive numbers satisfying $T_n^0 \sqrt{M(T_n^0)} \asymp n$. If for the
sequence $T_n = T_n^0 \wedge n^{1/2}$ all the assumptions of Theorem 3 are satisfied, then for some $C > 0$
the linear U-test $\phi_n(T)$ based on the threshold $T = T_n$ satisfies

$$\gamma_n\big(\mathcal F_0, \mathcal F_1(C\, T_n^{-1/2}), \phi_n\big) \le \gamma.$$

Thus, the rate of convergence is $r^*_n = (T_n^0)^{-1/2}$ if $T_n^0 = o(n^{1/2})$ and $r^*_n = n^{-1/4}$ otherwise.

Remark 8. Condition [D4] of Theorem 3 is more obscure than the other assumptions of the
theorem. Clearly, it imposes additional smoothness constraints on the function $f$. Using the
Cauchy-Schwarz inequality, one can easily check that either one of the assumptions [D4-1]
and [D4-2] below is sufficient for [D4]:

[D4-1] For some constants $D_5$ and $D_6$, $\sup_{f\in\Sigma} \|f\|_\infty \le D_5$ and $\max_{l\in\mathcal N(T)} |q_l/c_l| \le D_6$.
[D4-2] For some constant $D_4'$, $\sup_{f\in\Sigma} \|T_q[f]\|_4 \le D_4'$.

4.2. Lower bound on the minimax rate 

We will show in this subsection that the asymptotic rate of separation provided by Corollary 1
is unimprovable, in the sense that there is no testing procedure having a faster separation
rate. To this end, for every $a \in \{-, +\}$ we set $M_a(T) = \sum_{l\in\mathcal L_a\cap\mathcal N(T)} q_l^2$, $N_a(T) = |\mathcal L_a \cap \mathcal N(T)|$,

$$M_*(T) = M_+(T) \vee M_-(T), \qquad N_*(T) = N_+(T)\,\mathbb 1_{\{M_+(T) > M_-(T)\}} + N_-(T)\,\mathbb 1_{\{M_+(T) \le M_-(T)\}}.$$

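Theorem 4. Let us consider the problem of testing $H_0 : f \in \mathcal F_0$ against $H_1 : f \in \mathcal F_1(\rho)$,
where $\mathcal F_0$ and $\mathcal F_1$ are defined by (5) and

$$\Sigma = \Big\{ f = \sum_{l\in\mathcal L} \theta_l[f]\varphi_l \;:\; \sum_{l\in\mathcal L} c_l\theta_l[f]^2 \le 1,\ \|f\|_4 \vee \|f \cdot T_q[f]\|_2 \le 1 \Big\}.$$

Assume that the sets $\mathcal L_+$ and $\mathcal L_-$ defined by (21) are both nonempty and that the $\xi_i$'s are Gaussian.
The following assertions are true.

1. For every $\gamma < 1/4$ there exists $C > 0$ such that $\liminf_{n\to\infty} \gamma_n(\mathcal F_0, \mathcal F_1(C n^{-1/4})) \ge \gamma$.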

2. Let T® be a sequence of reals such that 4T^^/ M(T®) > nz^^, 2 as n — > 00. If the 
assumptions [Dl] (cf. Theorem 3) and 

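[D5] $N_*(T_n^0) \to \infty$ so that $N_*(T_n^0) \log N_*(T_n^0) = o(n)$,

[D6] there exists $D_6 > 0$ such that $\sup_{t\in\Delta} \sum_{l\in\mathcal N_*(T_n^0)} \varphi_l(t)^2 \le D_6\, N_*(T_n^0)$,

are fulfilled, then there exists $C > 0$ such that $\liminf_{n\to\infty} \gamma_n(\mathcal F_0, \mathcal F_1(C (T_n^0)^{-1/2})) \ge \gamma$.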

Corollary 2. Combining the two assertions of this theorem, we get that the minimax rate of
separation $r^*_n$ is lower bounded by $n^{-1/4} \vee (T_n^0)^{-1/2} = (n^{1/2} \wedge T_n^0)^{-1/2} = T_n^{-1/2}$. Thus, if the
conditions of Theorems 3 and 4 are satisfied, then the minimax rate of separation is given by
$r^*_n = T_n^{-1/2}$, where $T_n = n^{1/2} \wedge T_n^0$ and $T_n^0$ is determined from the relation $T_n^0 M(T_n^0)^{1/2} \asymp n$.

4.3. Testing equality of norms

As an application of the testing methodology developed in this section, we consider the
problem of testing the equality of norms of two functions observed in a noisy environment.
More precisely, let us consider the following two-sample problem: for $i = 1, \ldots, n$ we observe
$(x_{1,i}, t_{1,i})$ and $(x_{2,i}, t_{2,i})$ such that

$$x_{s,i} = g_s(t_{s,i}) + \xi_{s,i}, \qquad i = 1, \ldots, n, \quad s = 1, 2,$$

where the $t_{s,i}$'s are independent random vectors drawn from the uniform distribution over $[0, 1]^d$.
Furthermore, we assume that the $\xi_{s,i}$'s are i.i.d. such that $E(\xi_{s,i} \mid \{t_{s,j}\}) = 0$, $E(\xi_{s,i}^2 \mid \{t_{s,j}\}) = 1$
and, for some $C_\xi < \infty$, $E(\xi_{s,i}^4 \mid \{t_{s,j}\}) \le C_\xi$ almost surely.

Assuming that both $g_1$ and $g_2$ belong to a smoothness class $\Sigma$, we wish to test the hypothesis
$H_0 : \|g_1\|_2 = \|g_2\|_2$, against $H_1 : \big| \|g_1\|_2^2 - \|g_2\|_2^2 \big| \ge \rho^2$.

It can be useful to perform such a test prior to using a shifted curve model in the context of
curve registration (Dalalyan and Collier, 2012; Collier, 2012). Indeed, if there exists $\tau \in [0, 1]^d$
such that $g_1(t) = g_2(t - \tau)$ for every $t \in [0, 1]^d$ and the function $g_1$ is one-periodic, then
necessarily $\|g_1\|_2 = \|g_2\|_2$. Thus, the rejection of the null hypothesis implies the inadequacy
of the shifted curve model. In order to show how this type of test can be derived from the
framework presented in the previous subsections, let us consider the case of a Sobolev ellipsoid
$\Sigma$.

Let $\{\psi_m\}_{m\in\mathcal M}$ be an orthonormal basis of the subspace $L_{2,c}([0, 1]^d)$ of $L_2([0, 1]^d)$ consisting of
all the functions orthogonal to the constant function. We will assume that both $g_1$ and $g_2$ are
centered (this implies that they are orthogonal to the constant function as well). The Fourier
coefficients of a function $g$ w.r.t. a basis $\{\psi_m\}$ will be denoted by $\theta_m[g]$. We assume that for
some array $c$ and some constant $L > 0$ it holds that

$$g_s \in \Sigma_L^0 = \Big\{ g \in L_{2,c}([0, 1]^d) : \sum_{m\in\mathcal M} c_m\theta_m[g]^2 \le 1,\ \|g\|_4 \le L \Big\}, \quad \forall s \in \{1, 2\}.$$

Assume now that we wish to test

$$H_0 : \sum_{m} q_m\theta_m[g_1]^2 = \sum_{m} q_m\theta_m[g_2]^2, \quad\text{against}\quad H_1 : \Big| \sum_{m} q_m \big(\theta_m[g_1]^2 - \theta_m[g_2]^2\big) \Big| \ge \rho^2,$$

where $q = \{q_m\}$ is a given array. In order to show that this problem can be solved within the
framework of the previous subsections, we introduce the functional set

$$\Sigma_L = \big\{ f : [0, 1]^{2d} \to \mathbb R : f(t_1, \ldots, t_{2d}) = g_1(t_1, \ldots, t_d) + g_2(t_{d+1}, \ldots, t_{2d}) \text{ with } g_1, g_2 \in \Sigma_L^0 \big\}.$$

Setting $\mathcal L = \mathcal M \times \{1, 2\}$ and, for $l = (m, s) \in \mathcal M \times \{1, 2\}$,

$$\varphi_l(t_1, t_2) = \psi_m(t_s), \quad \text{for all } t = (t_1, t_2) \in [0, 1]^d \times [0, 1]^d,$$

we get an orthonormal basis of $\Sigma_L$. Clearly, for a function $f \in \Sigma_L$ we have $\theta_l[f] = \theta_{m,s}[f] =
\theta_m[g_s]$. This implies that $\Sigma_L$ is included in the set $\Sigma_L^2 = \{f : \sum_{(m,s)} c_m\theta_{m,s}[f]^2 \le 2,\ \|f\|_4 \le
2L\}$ and contains the set $\Sigma_L^1 = \{f : \sum_{(m,s)} c_m\theta_{m,s}[f]^2 \le 1,\ \|f\|_4 \le L\}$. Therefore, for studying
the rate of separation of a testing procedure we can assume that $f \in \Sigma_L^2$, whereas for
establishing lower bounds on the minimax rate of separation we can use the relation $\Sigma_L^1 \subset \Sigma_L$. In
both cases, this perfectly matches the framework of the previous subsections.
We give a concrete example by setting $\mathcal M = \mathbb Z^d$ and choosing as $\{\psi_m\}$ the Fourier basis
in dimension $d$. Similarly to the example in Subsection 3.3, we focus on anisotropic Sobolev
smoothness classes defined via coefficients

$$c_m = \sum_{i=1}^{d} (2\pi m_i)^{2\sigma_i}, \qquad m \in \mathbb Z^d,$$

for some $\sigma = (\sigma_1, \ldots, \sigma_d) \in \mathbb R_+^d$. As it was done previously, $\bar\sigma$ stands for the harmonic mean
of the $\sigma_i$'s: $\bar\sigma = \big(\frac{1}{d}\sum_{i=1}^{d} \sigma_i^{-1}\big)^{-1}$. To test the equality of norms, we introduce the coefficients $q_l$,
$l = (m, s) \in \mathbb Z^d \times \{1, 2\}$, of the quadratic functional $Q$:

$$q_{m,s} = (-1)^s, \qquad (m, s) \in \mathbb Z^d \times \{1, 2\}.$$

Theorems 3 and 4, as well as the computations done in the proof of Proposition 4, imply that
the minimax rate of separation in the problem described above is $r^*_n = n^{-\frac{2\bar\sigma}{4\bar\sigma+d} \wedge \frac{1}{4}}$. This rate
shows that the watershed between the two regimes corresponds to the condition $\bar\sigma = d/4$.
In other terms, we are in the regular regime when $\bar\sigma > d/4$. It is interesting to note, even if
we are unable to establish a direct connection, that this is also the regime under which the
Sobolev embedding $W_2^{\sigma} \subset L_4([0, 1]^d)$ holds true.
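The rate exponent and the regime can be tabulated for given smoothness vectors. The sketch below only evaluates the formula stated above, $r^*_n = n^{-a}$ with $a = \frac{2\bar\sigma}{4\bar\sigma+d}\wedge\frac14$; the function names are ours and exact rational arithmetic is used so that the watershed $\bar\sigma = d/4$ is detected without rounding error.

```python
from fractions import Fraction

def harmonic_mean(sigmas):
    """Harmonic mean of the smoothness indices sigma_1, ..., sigma_d."""
    d = len(sigmas)
    return d / sum(Fraction(1) / Fraction(s) for s in sigmas)

def rate_exponent(sigmas):
    """Exponent a such that r*_n = n^(-a), with a = min(2 sbar / (4 sbar + d), 1/4)."""
    d = len(sigmas)
    sbar = harmonic_mean(sigmas)
    return min(2 * sbar / (4 * sbar + d), Fraction(1, 4))

def regime(sigmas):
    """'regular' when the harmonic mean exceeds d/4, 'irregular' otherwise."""
    d = len(sigmas)
    return "regular" if harmonic_mean(sigmas) > Fraction(d, 4) else "irregular"

for sigmas in [(1, 1), (Fraction(1, 4), Fraction(1, 4)), (Fraction(1, 2), Fraction(1, 2)), (3, Fraction(1, 3))]:
    print(sigmas, harmonic_mean(sigmas), rate_exponent(sigmas), regime(sigmas))
```

At the watershed $\bar\sigma = d/4$ the two expressions inside the minimum coincide, since $2(d/4)/(4(d/4)+d) = 1/4$.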

Appendix A: Proofs of results stated in Section 2 
A.1. Proof of Proposition 1



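Throughout the proof, the terms $o(1)$, $O(1)$ and the equivalences are uniform over $\Sigma$. Let
$\mathcal L(w_n)$ be the support of $w_n$. $E^{\mathcal D_2}$ will denote the conditional expectation with respect to $\mathcal D_2$.
We define

$$h_n[f, w_n] = \Big(\frac{m(m-1)}{2}\Big)^{1/2} \sum_{l\in\mathcal L(w_n)} w_{l,n}\,\theta_l^2[f], \qquad (23)$$

$$G_n(t_1, t_2) = \sum_{l\in\mathcal L(w_n)} w_{l,n}\,\varphi_l(t_1)\varphi_l(t_2). \qquad (24)$$

This allows us to rewrite the U-statistic $U_n$ in the form $U_n = U_{n,0} + U_{n,1} + U_{n,2}$ where

$$U_{n,k} = \Big(\frac{2}{m(m-1)}\Big)^{1/2} \sum_{1\le i<j\le m} K_{n,k}(z_i, z_j), \qquad k = 0, 1, 2,$$

are U-statistics with the kernels

$$K_{n,0}(z_1, z_2) = \xi_1\xi_2\, G_n(t_1, t_2), \qquad (25)$$

$$K_{n,1}(z_1, z_2) = \big[\xi_1 (f - \widehat{\Pi f}_n)(t_2) + \xi_2 (f - \widehat{\Pi f}_n)(t_1)\big]\, G_n(t_1, t_2), \qquad (26)$$

$$K_{n,2}(z_1, z_2) = (f - \widehat{\Pi f}_n)(t_1)\,(f - \widehat{\Pi f}_n)(t_2)\, G_n(t_1, t_2). \qquad (27)$$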
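To prove Proposition 1 and the subsequent results, we need two auxiliary lemmas.

Lemma 3. Let $w_n = (w_{l,n})_{l\in\mathcal L}$ be a family of positive numbers containing only a finite number
of nonzero entries and such that $\sum_{l\in\mathcal L} w_{l,n}^2 = 1$. Let $\mathcal L(w_n)$ be the support of $w_n$. Then the
expectation of the U-statistic $U_n$ is given by:

$$E_f[U_n] = E_f[U_{n,2}] = h_n[f, w_n],$$

whereas for the variances it holds

$$E_f[U_{n,0}^2] = 1,$$

$$E_f[U_{n,1}^2] \le 2\|w_n\|_\infty^2 \Big(\sup_{t\in\Delta} \sum_{l\in\mathcal L(w_n)} \varphi_l^2(t)\Big) \Big(\|\Pi_{S_F} f\|_2^2 + E_f\big[\|\Pi f - \widehat{\Pi f}_n\|_2^2\big]\Big), \qquad (28)$$

$$\mathrm{Var}_f[U_{n,2}] \le 8\|w_n\|_\infty^2 \Big(\sup_{t\in\Delta} \sum_{l\in\mathcal L(w_n)} \varphi_l^2(t)\Big) \Big(\|\Pi_{S_F} f\|_4^4 + E_f\big[\|\Pi f - \widehat{\Pi f}_n\|_4^4\big]\Big)$$
$$\qquad + 8 h_n[f, w_n]\, \|w_n\|_\infty \Big(\sup_{t\in\Delta} \sum_{l\in\mathcal L(w_n)} \varphi_l^2(t)\Big)^{1/2} \Big(\|\Pi_{S_F} f\|_4^4 + E_f\big[\|\Pi f - \widehat{\Pi f}_n\|_4^4\big]\Big)^{1/2}. \qquad (29)$$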



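Proof. It is clear that $E_f U_{n,0} = E_f U_{n,1} = 0$, while

$$E_f[U_{n,2}] = \Big(\frac{m(m-1)}{2}\Big)^{1/2} E_f[K_{n,2}(z_1, z_2)]$$

with

$$E_f[K_{n,2}(z_1, z_2)] = E_f\Big[\sum_{l\in\mathcal L(w_n)} w_{l,n} \Big(\int_\Delta \big(f(t) - \widehat{\Pi f}_n(t)\big)\varphi_l(t)\, dt\Big)^2\Big].$$

As $\widehat{\Pi f}_n \in \mathrm{span}(\{\varphi_l\}_{l\in S_F^c})$, we have $\int \widehat{\Pi f}_n\, \varphi_l = 0$ for all $l \in S_F$. Therefore

$$E_f[U_{n,2}] = \Big(\frac{m(m-1)}{2}\Big)^{1/2} \sum_{l\in\mathcal L(w_n)} w_{l,n}\,\theta_l^2[f] = h_n[f, w_n].$$

Now, let us evaluate the variances. Since the $\xi_i$'s are uncorrelated zero-mean random variables
with variance one, and the $\varphi_l$'s are orthonormal, it holds that $E_f[U_{n,0}^2] = E_f[G_n(t_1, t_2)^2] =
\sum_l w_{l,n}^2 = 1$. For $U_{n,1}$, we have

$$\mathrm{Var}_f[U_{n,1}] = E_f[U_{n,1}^2] = E_f E^{\mathcal D_2}[K_{n,1}^2(z_1, z_2)].$$

Using the definition of $G_n(t_1, t_2)$, we get

$$E^{\mathcal D_2}[K_{n,1}^2(z_1, z_2)] = 2\int_\Delta\!\!\int_\Delta (f - \widehat{\Pi f}_n)^2(t_1)\, G_n^2(t_1, t_2)\, dt_1\, dt_2$$
$$= 2\int_\Delta (f - \widehat{\Pi f}_n)^2(t_1) \sum_{l\in\mathcal L(w_n)} w_{l,n}^2\, \varphi_l^2(t_1)\, dt_1$$
$$\le 2\Big(\max_{l} w_{l,n}^2\Big)\Big(\sup_{t\in\Delta} \sum_{l\in\mathcal L(w_n)} \varphi_l^2(t)\Big) \|f - \widehat{\Pi f}_n\|_2^2.$$

Then, the Pythagoras theorem yields

$$E_f\|f - \widehat{\Pi f}_n\|_2^2 = \|f - \Pi f\|_2^2 + E_f\|\Pi f - \widehat{\Pi f}_n\|_2^2 = \|\Pi_{S_F} f\|_2^2 + E_f\|\Pi f - \widehat{\Pi f}_n\|_2^2.$$

This completes the proof of (28). As for the variance of $U_{n,2}$, we have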

Var f [U n>2 ] = E f Ef[Ul 2 ) - (E f [U rh2 \) 2 = A nA + A n>2 + A Uj3 , 

where 

A n>1 = E f J! (f -fTf n )\ti)(f -fTfrfi^Glit^dtidtz, 

An ' 2 = m(m-1) {t) Ef W U -KfSit^f - nf n )(t 2 )G n (t u t 2 ) 

x (/ -nf n )(t 3 )G n (t h t 3 )dtidt 2 dt 3 , 

and 

^n,3= m(m 4 _ 1} // /(tl)/(t a )G„(ti,t 3 )dti(it a } a - (E f U n , 2 f. 

Let us bound the first term $A_{n,1}$:

A n ,l = Ef w l,nWV,n{ /(/-n? n ) 2 (t)^(t)w(t)dt) 2 . 

l,l'EC(w n ) 

Now, in view of Bessel's inequality, 

< max wl \ n E f>B £ /(/ - n/„) 4 (t)^ 2 (t) dt 

* (^r,*) (sE,^ rfwWii/ - > 

and the expression inside the last expectation can be bounded using the inequality
$\|f - \widehat{\Pi f}_n\|_4^4 \le 8\big(\|\Pi_{S_F} f\|_4^4 + \|\Pi f - \widehat{\Pi f}_n\|_4^4\big)$.

The term $A_{n,2}$ can be dealt with similarly. Using the Cauchy-Schwarz inequality,



in,2 



m(m — 1) y 3 

) 



< 

' 2 



< ( max wi 

■ ieC(w n ) 



J2 wi,nwi>, n ei[m,[f]E f { f (/-n/ n ) a (t) w (t)w(t)dt} 

!'SC(w n ) 

£«?„*l[/] 2 )( £ { / ^/[(/-n/n) 2 (t)]^(t)w(t)dt} 2 ) 1/2 

J M'&C(w n ) 

n)K[f,-Wn}( £ £ /{ / (/ " ^fn) \t)<Pl (*) W (*) * }' 



2\ 1/2 



'e£( 

By virtue of the Bessel inequality, it holds that 

J E f l{f-Llf n )-(t)\tf(t)dtj [ ~ 



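$$A_{n,2} \le \Big(\max_{l} w_{l,n}\Big)\, h_n[f, w_n] \Big(\sum_{l\in\mathcal L(w_n)} \int_\Delta E_f\big[(f - \widehat{\Pi f}_n)^4(t)\big]\, \varphi_l^2(t)\, dt\Big)^{1/2}$$
$$\le \Big(\max_{l} w_{l,n}\Big)\, h_n[f, w_n] \Big(\sup_{t\in\Delta} \sum_{l\in\mathcal L(w_n)} \varphi_l^2(t)\Big)^{1/2} \Big(E_f\big[\|f - \widehat{\Pi f}_n\|_4^4\big]\Big)^{1/2}.$$

The last expectation can be bounded in the same way as we did several lines above for the
term $A_{n,1}$. The last term $A_{n,3}$ is actually negative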

' 6 m(m - 1) \4 J V^ie£(w„) l ' n 1 ) 2 V^/e£(w n ) l ' n 1 J ~ 

Combining all these estimates, we get (29). □ 
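The identity $E_f[U_{n,0}^2] = 1$ used in the proof above is stated without detail. A short worked derivation, assuming (as the integrals in the proof do) that the design is uniform on $\Delta$ and that the noise variables are independent of the design:

```latex
\begin{align*}
E_f\bigl[G_n(t_1,t_2)^2\bigr]
  &= \sum_{l,l'\in\mathcal L(w_n)} w_{l,n}\,w_{l',n}
     \Bigl(\int_\Delta \varphi_l(t)\varphi_{l'}(t)\,dt\Bigr)^{2}
   = \sum_{l\in\mathcal L(w_n)} w_{l,n}^2 = 1, \\
E_f\bigl[U_{n,0}^2\bigr]
  &= \frac{2}{m(m-1)} \sum_{i<j}\ \sum_{i'<j'}
     E_f\bigl[\xi_i\xi_j\xi_{i'}\xi_{j'}\,G_n(t_i,t_j)\,G_n(t_{i'},t_{j'})\bigr]
   = \frac{2}{m(m-1)} \sum_{i<j} E_f\bigl[G_n(t_i,t_j)^2\bigr] = 1 .
\end{align*}
```

The first line uses orthonormality of the $\varphi_l$; in the second, every term with $(i,j) \ne (i',j')$ contains some $\xi$ to the first power and therefore has zero mean.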

Lemma 4. Let $w_n = (w_{l,n})_{l\in\mathcal L}$ be a family of positive numbers containing only a finite number
of nonzero entries and such that $\sum_{l\in\mathcal L} w_{l,n}^2 = 1$. Assume that the random variable $\xi_1$ has finite
fourth moment: $E[\xi_1^4] < \infty$. If, as $n \to \infty$,



|w n || 00 = o(l) and ||w n ||^ ( sup V . ^z(t) 2 ) = o(n), (30) 



then $U_{n,0}$ is asymptotically Gaussian $\mathcal N(0, 1)$.

Proof. This result is an immediate consequence of (Hall, 1984, Theorem 1). □ 

With these tools at hand, we are now in a position to establish the asymptotic normality
of the U-statistic $U_n$, which leads to an evaluation of the type I error of the U-test. Let us
recall that, for $f \in \mathcal F_0$, it holds $Q[f] = \sum_l q_l\theta_l[f]^2 = 0$ and, therefore, $\theta_l[f] = 0$ for all
$l \in S_F = \{l : q_l \ne 0\}$. Hence, for every $f \in \mathcal F_0$, $h_n[f, w_n] = 0$ and $\Pi_{S_F} f = 0$. So, it follows
from Lemma 3 that under the assumptions of the proposition, the convergences $E_f[U_{n,1}^2] \to 0$
and $E_f[U_{n,2}^2] \to 0$ hold true uniformly in $f \in \mathcal F_0$. This implies that $U_{n,1}$ and $U_{n,2}$ tend to
zero in $P_f$-probability, uniformly in $f \in \mathcal F_0$. On the other hand, according to Lemma 4,
$U_{n,0} \to \mathcal N(0, 1)$ in distribution. The claim of the proposition follows from Slutsky's lemma.



A.2. Proof of Proposition 2

We first note that for every $h > 0$ it holds

$$\sup_{f\in\mathcal F_1(\rho)} P_f(U_n < u) = \Big(\sup_{f\in\mathcal F_1(\rho):\, h_n[f,w_n] > h} P_f(U_n < u)\Big) \vee \Big(\sup_{f\in\mathcal F_1(\rho):\, h_n[f,w_n] \le h} P_f(U_n < u)\Big). \qquad (31)$$

The value of $h$ will be made precise later in the proof. For now, assume merely that $h > 2(1 + u)$.
Then, by Chebyshev's inequality,

$$\sup_{f\in\mathcal F_1(\rho):\, h_n[f,w_n] > h} P_f(U_n < u) \le \sup_{f\in\Sigma:\, h_n[f,w_n] > h} \frac{\mathrm{Var}_f(U_n)}{(E_f[U_n] - u)^2} = \sup_{f\in\Sigma:\, h_n[f,w_n] > h} \frac{\mathrm{Var}_f(U_n)}{(h_n[f, w_n] - u)^2}.$$

Using the conditions of the proposition and the inequalities of Lemma 3, we get that for some
constants $C, C'$ independent of $h$,

$$\sup_{f\in\mathcal F_1(\rho):\, h_n[f,w_n] > h} P_f(U_n < u) \le \sup_{f\in\Sigma:\, h_n[f,w_n] > h} \frac{C(1 + h_n[f, w_n])}{(h_n[f, w_n] - u)^2} \le \frac{C'(1 + h)}{(h - u)^2}. \qquad (32)$$

Let us switch to the second sup in (31). Let $\delta_n > 0$ be a sequence tending to zero. One readily
checks that

$$P_f(U_n < u) = P_f\big(h_n[f, w_n] + U_{n,0} + U_{n,1} + (U_{n,2} - h_n[f, w_n]) < u\big)$$
$$\le P_f\big(h_n[f, w_n] + U_{n,0} < u + \delta_n\big) + P_f\big(-U_{n,1} - (U_{n,2} - h_n[f, w_n]) > \delta_n\big)$$
$$\le F_{U_{n,0}}\big(u - h_n[f, w_n] + \delta_n\big) + \frac{2\mathrm{Var}_f(U_{n,1}) + 2\mathrm{Var}_f(U_{n,2})}{\delta_n^2}, \qquad (33)$$

where $F_{U_{n,0}}(\cdot)$ is the c.d.f. of $U_{n,0}$. On the one hand, we know from Lemma 4 that $U_{n,0}$
converges in distribution to $\mathcal N(0, 1)$. This entails that $F_{U_{n,0}}$ converges uniformly over $\mathbb R$ to $\Phi$.
Therefore,

$$F_{U_{n,0}}(u - h_n[f, w_n] + \delta_n) = \Phi(u - h_n[f, w_n] + \delta_n) + o(1) = \Phi(u - h_n[f, w_n]) + o(1) + \delta_n O(1).$$

On the other hand, in view of Lemma 3, Var f(U n< i) + Varf{U ni 2) = 0(||ns F /||| + 1 1 XT^g-^ ^ 1 1 2 ) - 
Then we have, 

iiiwiii=£^f E w ^f + £ * 2 

V2h n [f,w n ] _! 

- tt^ — rr + sup c « • 

Applying Hölder's inequality we get $\|\Pi_{S_F} f\|_4^2 \le \|\Pi_{S_F} f\|_2^{(p-4)/(p-2)}\, \|\Pi_{S_F} f\|_p^{p/(p-2)}$.
Therefore, we have

sup P f (U n <u)< sup $( u -fc n [/,w n ]) + o(l) + <$ n O(l) + -^ ^ — -A 

/ln[/,W„]</l 

Choosing $h$ large enough and then making $\delta_n$ tend to zero sufficiently slowly we get the desired
result.



A.3. Proof of Proposition 3

Using Kneser's minimax theorem for bilinear forms (Kneser, 1952), we can interchange the
sup and the inf as follows:

$$\sup_{\|w\|_2 = 1}\ \inf_{\langle v,c\rangle \le 1,\ \langle v,q\rangle \ge \rho^2} \langle w, v\rangle = \inf_{\langle v,c\rangle \le 1,\ \langle v,q\rangle \ge \rho^2}\ \sup_{\|w\|_2 = 1} \langle w, v\rangle = \inf_{\langle v,c\rangle \le 1,\ \langle v,q\rangle \ge \rho^2} \|v\|_2. \qquad (34)$$

Furthermore, the array $w^*$ attaining the sup is given by $w_l^* = v_l/\|v\|_2$. Now, the minimization
at the right-hand side of (34) involves a convex second-order cost function $\|v\|_2^2$ and linear
constraints $v_l \ge 0$, $\langle v, c\rangle \le 1$ and $\langle v, q\rangle \ge \rho^2$. Therefore, according to the KKT conditions, if there
exist $\mu, \lambda > 0$ and $\nu \in \mathbb R_+^{\mathcal L}$ satisfying for some $v^* \in \mathbb R_+^{\mathcal L}$ the conditions $2v^* + \lambda c - \mu q - \nu = 0$
and $\lambda(\langle v^*, c\rangle - 1) = 0$, $\mu(\langle v^*, q\rangle - \rho^2) = 0$ and $\nu_l v_l^* = 0$ for all $l$, then $v^*$ is a solution to the
minimization problem (34). Under the conditions of the proposition, one easily checks that
these KKT conditions are fulfilled with $\lambda = 2/\sum_l c_l (T_\rho q_l - c_l)_+$, $\mu = 2T_\rho/\sum_l c_l (T_\rho q_l - c_l)_+$
and $\nu_l = 2(c_l - T_\rho q_l)_+/\sum_l c_l (T_\rho q_l - c_l)_+$.
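These multipliers can be sanity-checked numerically. The sketch below rests on an assumption that is not spelled out in this passage: the candidate minimizer is taken to be $v^*_l = (T_\rho q_l - c_l)_+ / \sum_{l'} c_{l'}(T_\rho q_{l'} - c_{l'})_+$, the form suggested by the multipliers above. The arrays `c`, `q` and the value of `T` are hypothetical, and $\rho^2$ is then defined as $\langle v^*, q\rangle$ so that the second constraint is active by construction.

```python
import numpy as np

def kkt_candidate(c, q, T):
    """Candidate solution and multipliers for: minimize ||v||_2^2
    subject to v >= 0, <v, c> <= 1, <v, q> >= rho^2."""
    pos = np.maximum(T * q - c, 0.0)        # (T q_l - c_l)_+
    S = float(np.sum(c * pos))              # normalizing constant
    v = pos / S                             # assumed form of v*
    lam = 2.0 / S                           # multiplier of <v, c> <= 1
    mu = 2.0 * T / S                        # multiplier of <v, q> >= rho^2
    nu = 2.0 * np.maximum(c - T * q, 0.0) / S   # multipliers of v_l >= 0
    return v, lam, mu, nu

# Hypothetical arrays and truncation level.
c = np.array([1.0, 4.0, 9.0, 16.0, 25.0])
q = np.ones(5)
T = 10.0

v, lam, mu, nu = kkt_candidate(c, q, T)
stationarity = 2 * v + lam * c - mu * q - nu   # should vanish coordinate-wise
rho2 = float(v @ q)                            # separation level matching this T
print(v, lam, mu, rho2)
```

Stationarity reduces to the elementary identity $x_+ - x - (-x)_+ = 0$ applied to $x = T_\rho q_l - c_l$, which is why it holds for every coordinate regardless of the sign of $x$.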
A.4. Proof of Theorem 1

To ease notation, we set $\mathcal N_{n,\gamma} = \mathcal N(T_{n,\gamma})$. We first check that under the assumptions of the
theorem all the conditions required in Propositions 1 and 2 are fulfilled. Since $\|w_n^*\|_0 = |\mathcal N_{n,\gamma}|$
and $\|w_n^*\|_\infty^2 \le \max_{l\in\mathcal N_{n,\gamma}} q_l^2 \big/ \sum_{l\in\mathcal N_{n,\gamma}} \big(q_l - \frac{c_l}{T_{n,\gamma}}\big)^2$, condition [C1] implies the first condition
of Proposition 1. Conditions [C3] and [C4] imply respectively the third and the second
conditions of Proposition 1. Finally condition [C6] implies the fourth condition of Proposition 1.
Thus, we have checked that under the conditions of the theorem, the claim of Proposition 1
holds true. To check that the claim of Proposition 2 holds true as well, it suffices to check the
first assumption of that proposition (the second one being identical to [C7]). In fact, it is not
difficult to check that the first assumption of Proposition 2 follows from [C2], [C4] and [C5]
for the sequence ( 2 = min JeA /-„ i7 qf /^Y,ieM nn <!?■ 
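Therefore, combining the results of Propositions 1 and 2, we get that

$$\gamma_n\big(\mathcal F_0, \mathcal F_1(r^*_{n,\gamma}), \phi_n^*\big) \le \Phi(-z_{1-\gamma/2}) + \Phi\Big(z_{1-\gamma/2} - \inf_{f\in\mathcal F_1(r^*_{n,\gamma})} h_n[f, w_n^*]\Big) + o(1). \qquad (35)$$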

In view of Proposition 3, the infimum over / of h n [f, w*] can be evaluated as follows: 

inf /eJi(r« )^n[/,w*J = ( 5 ) inf 2^wl Bf 

E ; ^ 2 >K 7 ) 2 

1/2 



m(m — 


1) 


2 




m(m — 


1) 


2 




m(m — 


1) 


2 




m(m — 


1) 


2 



inf <w*,v> 

veM^:(v,c)<l 
(v,q)>(r* ) 2 



1/2 

llv* 



2\l/2 



DleA/-», 7 c K T n, 7 « - c l) 
Inserting this expression in (35) and using (15), we get that the left-hand side of (35) is bounded from above by

$$\Phi(-z_{1-\gamma/2}) + \Phi\big(z_{1-\gamma/2} - 2z_{1-\gamma/2} + o(1)\big) + o(1) = 2\Phi(-z_{1-\gamma/2}) + o(1) = \gamma + o(1).$$
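The last equality only uses the definition of the Gaussian quantile $z_{1-\gamma/2}$. A minimal numerical sketch follows; the function name `limiting_total_error` is an assumption introduced for illustration, and the $o(1)$ terms are simply dropped.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian N(0, 1)


def limiting_total_error(gamma):
    """Evaluate Phi(-z) + Phi(z - 2z) with z = z_{1-gamma/2}, ignoring o(1) terms."""
    z = STD_NORMAL.inv_cdf(1.0 - gamma / 2.0)
    return STD_NORMAL.cdf(-z) + STD_NORMAL.cdf(z - 2.0 * z)


value_at_5_percent = limiting_total_error(0.05)
```

Since $\Phi(-z_{1-\gamma/2}) = \gamma/2$, the two terms add up to the prescribed level $\gamma$ for any $\gamma\in(0,1)$.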



A.5. Proof of Theorem 2

The proof of the lower bound follows the steps of (Ingster and Sapatinas, 2009). However, we 
considerably modified the way some of these steps are carried out which allowed us to relax 
several assumptions and resulted in a shorter proof. 

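Let us recall that $\theta[f] = (\theta_l[f])_{l\in\mathcal L} \in \ell_2(\mathcal L)$ is the array of Fourier coefficients of a function $f$
in $L_2(\Delta)$ w.r.t. the system $(\varphi_l)_{l\in\mathcal L}$. We introduce the sets $\Theta_1(\rho) = \{\theta\in\ell_2(\mathcal L) : \langle c,\theta^2\rangle \le 1,\ \langle q,\theta^2\rangle \ge \rho^2\}$
and $\Theta_0 = \{\theta\in\ell_2(\mathcal L) : \langle c,\theta^2\rangle \le 1,\ \langle q,\theta^2\rangle = 0\}$, where we used the notation
$\theta^2 = \{\theta_l^2\}_{l\in\mathcal L}$. Clearly, if $f$ belongs to the functional class $\mathcal F_1(\rho)$ (resp. $\mathcal F_0$) then $\theta[f]\in\Theta_1(\rho)$
(resp. $\theta[f]\in\Theta_0$).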

Let $C < 1$ be a constant. Our goal is to prove that $\gamma_n(\mathcal F_0, \mathcal F_1(Cr^*_{n,\gamma})) \ge \gamma + o(1)$. To get
this lower bound, we define prior measures that are essentially concentrated on the sets $\Theta_0$
and $\Theta_1$. Let $\pi_1^n$ and $\pi_2^n$ be measures on the space $\ell_2(\mathcal L)$ such that $\pi_1^n(\Theta_0) = 1 + o(1)$ and
$\pi_2^n(\Theta_1(Cr^*_{n,\gamma})) = 1 + o(1)$. Those priors lead to the corresponding mixtures:

$$P_{\pi_i^n}(A) = \int P_\theta(A)\,\pi_i^n(d\theta) \quad\text{for every measurable set } A\subset(\Delta\times\mathbb R)^n,\ i = 1,2.$$

If $\gamma_n(P_{\pi_1^n}, P_{\pi_2^n}) = \inf_{\psi:(\Delta\times\mathbb R)^n\to\{1,2\}} \{P_{\pi_1^n}(\psi = 2) + P_{\pi_2^n}(\psi = 1)\}$ is the minimal total error
probability for testing the simple null hypothesis $H_0 : P = P_{\pi_1^n}$ against the simple alternative
$H_1 : P = P_{\pi_2^n}$, then we have (see Proposition 2.11 in Ingster and Suslina (2003))

$$\gamma_n(\mathcal F_0, \mathcal F_1(Cr^*_{n,\gamma})) \ge \gamma_n(P_{\pi_1^n}, P_{\pi_2^n}) + o(1).$$

As the next result shows, to get the desired lower bound, it suffices to show that the Bayesian
log-likelihood $\log(dP_{\pi_2^n}/dP_{\pi_1^n})$ is asymptotically equivalent to a Gaussian log-likelihood.

Lemma 5 (Section 4.3.1 in Ingster and Suslina (2003)). If there exist a deterministic
sequence $u_n$ and a sequence of random variables $\eta_n$ such that under $P_{\pi_1^n}$ the variable $\eta_n$ converges
in distribution to $\mathcal N(0, 1)$ and

$$\log(dP_{\pi_2^n}/dP_{\pi_1^n}) = u_n\eta_n - \frac{u_n^2}{2} + o_P(1), \tag{36}$$

then $\gamma_n(P_{\pi_1^n}, P_{\pi_2^n}) \ge 2\Phi(-u_n/2) + o(1)$.
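The quantity $2\Phi(-u/2)$ in Lemma 5 is the minimal total error in the limiting Gaussian shift experiment, where one observes $\eta\sim\mathcal N(0,1)$ under the null and $\eta\sim\mathcal N(u,1)$ under the alternative, so that the log-likelihood ratio is exactly $u\eta - u^2/2$. The sketch below illustrates this for a fixed $u$; it is not the proof of the lemma, and the names `total_error` and `best` are assumptions introduced for the example.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard Gaussian distribution function


def total_error(u, threshold):
    """Type I + type II error of the test rejecting when eta > threshold.

    eta ~ N(0, 1) under the null and eta ~ N(u, 1) under the alternative.
    """
    type_one = 1.0 - PHI(threshold)   # P_0(eta > threshold)
    type_two = PHI(threshold - u)     # P_u(eta <= threshold)
    return type_one + type_two


u = 1.5
best = total_error(u, u / 2.0)        # likelihood ratio test: threshold u/2
others = [total_error(u, u / 2.0 + shift) for shift in (-0.5, -0.1, 0.1, 0.5)]
```

The likelihood ratio threshold $u/2$ gives total error $2\Phi(-u/2)$, and moving the threshold in either direction only increases the sum of the two errors.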

For our purposes, we choose $\pi_1^n$ to be the Dirac measure at $0$ and denote the corresponding
mixture probability $P_{\pi_1^n}$ by $P_0$. It is clear that with this choice $\pi_1^n(\Theta_0) = 1$. We now explain
how $\pi_2^n$, that we will call $\pi_n$ from now on, is built. Let $a_n\in\mathbb R_+^{\mathcal L}$ be an array containing a
finite number of nonzero elements. Let $\mathcal L(a_n)$ be the support of $a_n$, i.e., $a_l \ne 0$ if and only if
$l\in\mathcal L(a_n)$. We assume that $\mathcal L(a_n)\subset\mathcal L_+$ and define $\pi_n(d\theta)$ as the Gaussian product measure
such that under $\pi_n$ the entries $\theta_l$ are independent Gaussian with zero mean and variance $a_l$.

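Proposition 6. Let $\delta\in(0,1)$ be such that $1-\delta > C$. Assume that $a_n = (1-\delta)v_n$ and, as
$n\to\infty$, the following assumptions are fulfilled:

[L1] $\langle c, v_n\rangle \le 1$ and $\langle q, v_n\rangle \ge (r^*_{n,\gamma})^2$,

[L2] $\max_{l\in\mathcal L(v_n)}(q_lv_l) = o(\langle q, v_n\rangle)$ and $\max_{l\in\mathcal L(v_n)}(c_lv_l) = o(\langle c, v_n\rangle)$,

[L3] $\|v_n\|_0\to\infty$ and $n\|v_n\|_\infty^2\|v_n\|_0^2\log\|v_n\|_0\to 0$,

[L4] $n\|v_n\|_\infty\|v_n\|_0^{1/3}\to 0$ and $\|v_n\|_3 = o(\|v_n\|_2)$,

[L5] for some $L_5 > 0$, $\sup_{t\in\Delta}\sum_{l\in\mathcal L(a_n)}\varphi_l(t)^2 \le L_5\|a_n\|_0$.

Then, as $n\to\infty$,

$$\gamma_n(\mathcal F_0, \mathcal F_1(Cr^*_{n,\gamma})) \ge 2\Phi\Big(-\frac{(1-\delta)n}{2\sqrt2}\|v_n\|_2\Big) + o(1). \tag{37}$$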

Proof. The proof of this proposition will be carried out with the help of several lemmas. The
fact that $\pi_n(\Theta_1(Cr^*_{n,\gamma})) = 1 + o(1)$ is proved in the following lemma.

Lemma 6. Assume that $a_n = (1-\delta)v_n$ satisfies [L1] and [L2]. Then, for every $\delta\in(0,1)$,
it holds that $\pi_n(\Theta_1(Cr^*_{n,\gamma})) = 1 + o(1)$.

Proof. Let us denote $\mathcal H_1(\theta) = \sum_{l\in\mathcal L}q_l\theta_l^2$ and $\mathcal H_2(\theta) = \sum_{l\in\mathcal L}c_l\theta_l^2$. In
view of [L1], we have

$$\int\mathcal H_1(\theta)\,\pi_n(d\theta) = \sum_l q_la_l \ge (r^*_{n,\gamma})^2(1-\delta), \qquad \int\mathcal H_2(\theta)\,\pi_n(d\theta) = \sum_l c_la_l \le 1-\delta.$$

On the other hand, since the variance of the sum of independent random variables equals the
sum of the variances of these random variables, we get

$$\int\mathcal H_1(\theta)^2\,\pi_n(d\theta) - \Big(\int\mathcal H_1(\theta)\,\pi_n(d\theta)\Big)^2 = 2\sum_l q_l^2a_l^2 \le 2\langle q, a_n\rangle\max_{l\in\mathcal L}(q_la_l).$$



By Tchebychev's inequality, we arrive at 



fa v fa\ / (n * ^2^ ^ 2max; gz:(Vn )(g m ) 



$$\pi_n\big(\theta : \mathcal H_2(\theta) > 1\big) \le \frac{2\max_{l\in\mathcal L(v_n)}(c_lv_l)}{\delta^2\langle c, v_n\rangle}.$$

The claim of the lemma follows now from condition [L2]. 



□ 



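Second, we show that for every $p > 2$ and every $L > 0$, the probability $\pi_n(\theta : \|\sum_l\theta_l\varphi_l\|_p > L)$
tends to zero. Indeed, in view of the Tchebychev inequality and Fubini's theorem,

$$\pi_n\Big(\theta : \Big\|\sum_l\theta_l\varphi_l\Big\|_p > L\Big) \le L^{-p}\int_\Delta E_{\pi_n}\Big[\Big|\sum_l\theta_l\varphi_l(t)\Big|^p\Big]\,dt.$$

Using the fact that for every fixed $t$, the random variable $\sum_l\theta_l\varphi_l(t)$ is Gaussian with zero
mean and variance $\sum_l a_l\varphi_l(t)^2$, we get

$$\pi_n\Big(\theta : \Big\|\sum_l\theta_l\varphi_l\Big\|_p > L\Big) \le p!\,L^{-p}\int_\Delta\Big(\sum_l a_l\varphi_l(t)^2\Big)^{p/2}dt \le p!\,L^{-p}L_5^{p/2}\big(\|a_n\|_\infty\|a_n\|_0\big)^{p/2}.$$

The last expression tends to zero as $n\to\infty$ in view of condition [L3].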

We focus now on the proof of (36). Set $m = |\mathcal L(a_n)|$ and let $\Phi_n$ be the $m\times n$ matrix having as
generic element $(\Phi_n)_{li} = \varphi_l(t_i)$. Let $A_n$ be the $m\times m$ diagonal matrix having the nonzero entries
of $a_n$ on its main diagonal. It is clear that under $P_{\pi_n}$, conditionally to $T_n$, $x = (x_1,\ldots,x_n)$
is distributed according to a multivariate Gaussian distribution with zero mean and $n\times n$
covariance matrix $R_n = \Phi_n^\top A_n\Phi_n + I_n$. Therefore, the logarithm of its density w.r.t. $P_0$ is given
by

$$\log\frac{dP_{\pi_n}}{dP_0}(x; t_1,\ldots,t_n) = -\frac12\big(\log\det R_n + x^\top(R_n^{-1} - I_n)x\big).$$



In what follows, we denote by $|||M||| = \sup_{\|x\|_2=1}\|Mx\|_2$ the spectral norm of a matrix $M$.

Lemma 7. Let $\bar R_n = nA_n + I_m$ and $m = m_n\to\infty$. If $n^2\|a_n\|_\infty^2\|a_n\|_0\,|||n^{-1}\Phi_n\Phi_n^\top - I_m|||^2 = o_P(1)$
and $|\mathrm{Tr}[\bar R_n^{-1}B_n]| + E[|\xi^\top\bar R_n^{-1}B_n\bar R_n^{-1}\xi|] = o_P(1)$, then under $P_0$ it holds
$\log(dP_{\pi_n}/dP_0) = -\frac12\big(\log\det\bar R_n + \xi^\top(\bar R_n^{-1} - I_m)\xi\big) + o_P(1)$, where $\xi\sim\mathcal N_m(0, I_m)$.

Proof. Let us denote $\tilde R_n = A_n^{1/2}\Phi_n\Phi_n^\top A_n^{1/2} + I_m$, $B_n = \tilde R_n - \bar R_n$ and introduce the function
$g(z) = \log\det(\bar R_n + zB_n)$ for $z\in[0,1]$. One easily checks that $g(1) = \log\det\tilde R_n = \log\det R_n$,
$g(0) = \log\det\bar R_n$ and $g'(z) = \mathrm{Tr}[(\bar R_n + zB_n)^{-1}B_n]$. Therefore, the relation $g(1) - g(0) = g'(z)$
for some $z\in[0,1]$ implies

$$\begin{aligned}
|\log\det R_n - \log\det\bar R_n| &= |\mathrm{Tr}[(\bar R_n + zB_n)^{-1}B_n]| \\
&\le |\mathrm{Tr}[\bar R_n^{-1}B_n]| + m\,|||(\bar R_n + zB_n)^{-1} - \bar R_n^{-1}|||\;|||B_n|||.
\end{aligned}$$

Using the identity $(\bar R_n + zB_n)^{-1} - \bar R_n^{-1} = -z(\bar R_n + zB_n)^{-1}B_n\bar R_n^{-1}$, we get

$$\begin{aligned}
|\log\det R_n - \log\det\bar R_n| &\le |\mathrm{Tr}[\bar R_n^{-1}B_n]| + m\,|||(\bar R_n + zB_n)^{-1}|||\;|||\bar R_n^{-1}|||\;|||B_n|||^2 \\
&\le |\mathrm{Tr}[\bar R_n^{-1}B_n]| + m\,|||B_n|||^2,
\end{aligned}$$

where we used that $\bar R_n$ and $\bar R_n + zB_n = I_m + zA_n^{1/2}\Phi_n\Phi_n^\top A_n^{1/2} + (1-z)nA_n$ have all their
eigenvalues $\ge 1$. On the other hand, one can check that $|||B_n||| \le n\,|||A_n|||\;|||n^{-1}\Phi_n\Phi_n^\top - I_m|||$.
Combining these inequalities with the facts $|||A_n||| = \|a_n\|_\infty$ and $m = \|a_n\|_0\to\infty$ we arrive at
$\log\det R_n = \log\det\bar R_n + o_P(1)$.

The term $x^\top R_n^{-1}x$ is dealt with similarly. First, using the singular value decomposition of the
matrix $A_n^{1/2}\Phi_n$ one can note that for an appropriately chosen vector $\xi\sim\mathcal N_m(0, I_m)$, it holds
that $x^\top(R_n^{-1} - I_n)x = \xi^\top(\tilde R_n^{-1} - I_m)\xi$. Then, we introduce the function $g(z) = \xi^\top[\bar R_n + zB_n]^{-1}\xi$,
the derivative of which is given by $g'(z) = -\xi^\top(\bar R_n + zB_n)^{-1}B_n(\bar R_n + zB_n)^{-1}\xi$. Therefore, for
some $z\in[0,1]$,

$$\begin{aligned}
|\xi^\top\tilde R_n^{-1}\xi - \xi^\top\bar R_n^{-1}\xi| &= |\xi^\top(\bar R_n + zB_n)^{-1}B_n(\bar R_n + zB_n)^{-1}\xi| \\
&\le |\xi^\top\bar R_n^{-1}B_n\bar R_n^{-1}\xi| + |\xi^\top[(\bar R_n + zB_n)^{-1} - \bar R_n^{-1}]B_n(\bar R_n + zB_n)^{-1}\xi| \\
&\qquad + |\xi^\top[(\bar R_n + zB_n)^{-1} - \bar R_n^{-1}]B_n\bar R_n^{-1}\xi| \\
&\le |\xi^\top\bar R_n^{-1}B_n\bar R_n^{-1}\xi| + 2\|\xi\|_2^2\,|||B_n|||^2.
\end{aligned}$$

It is well-known that $\|\xi\|_2^2$, being distributed according to the $\chi^2_m$ distribution, is $O_P(m)$, as
$m\to\infty$. This completes the proof of the lemma. □
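The equality of the two log-determinants used at the start of the proof, namely that the $n\times n$ matrix $\Phi_n^\top A_n\Phi_n + I_n$ and the $m\times m$ matrix $A_n^{1/2}\Phi_n\Phi_n^\top A_n^{1/2} + I_m$ have the same determinant, is an instance of Sylvester's determinant identity. The sketch below checks it on a small example in pure Python; the matrices and the helper names (`matmul`, `transpose`, `det`, `add_identity`) are hypothetical and chosen only for illustration.

```python
import math


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def transpose(x):
    return [list(row) for row in zip(*x)]


def det(x):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(x) == 1:
        return x[0][0]
    return sum((-1) ** j * x[0][j] * det([row[:j] + row[j + 1:] for row in x[1:]])
               for j in range(len(x)))


def add_identity(x):
    return [[x[i][j] + (1.0 if i == j else 0.0) for j in range(len(x))]
            for i in range(len(x))]


# Toy sizes m = 2, n = 3 (assumptions for illustration).
a = [0.3, 0.7]
phi = [[1.0, -0.5, 2.0],
       [0.25, 1.5, -1.0]]
a_mat = [[a[0], 0.0], [0.0, a[1]]]
a_half = [[math.sqrt(a[0]), 0.0], [0.0, math.sqrt(a[1])]]

r_big = add_identity(matmul(matmul(transpose(phi), a_mat), phi))                    # n x n
r_small = add_identity(matmul(matmul(matmul(a_half, phi), transpose(phi)), a_half))  # m x m

logdet_big = math.log(det(r_big))
logdet_small = math.log(det(r_small))
```

Both matrices are of the form identity plus a positive semi-definite matrix, so their determinants are at least 1 and the logarithms are well defined.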

According to (Vershynin, 2012, Cor. 5.52), under [C3], we have $|||n^{-1}\Phi_n\Phi_n^\top - I_m||| \le C(m\log m/n)^{1/2}$
with probability at least $1 - 1/n$. Furthermore, using the facts that $\bar R_n = nA_n + I_m$ is a diagonal matrix
with diagonal entries > 1 and that the variance of the sum of independent random variables 
equals the sum of variances, one readily checks that E\ Tr[R~ 1 Bn]| 2 + -E[|^ T Rn 1 B n R n ^ 1 ^| 2 ] < 
3Cg ?7,|| || ^ || ||§ . Hence, condition [L3] implies that the two conditions of the last lemma 
are fulfilled and, therefore, its claim holds true. Using the fact that $A_n$ is diagonal, we get

$$\log\frac{dP_{\pi_n}}{dP_0} = \frac12\sum_{l\in\mathcal L(a_n)}\Big(\frac{na_l\xi_l^2}{na_l+1} - \log(na_l+1)\Big) + o_P(1).$$



Lemma 8. Let us denote

$$u_n = \frac{n\|a_n\|_2}{\sqrt2}, \qquad \eta_n = \frac{1}{u_n}\sum_{l\in\mathcal L(a_n)}\frac{na_l(\xi_l^2-1)}{2(na_l+1)}.$$

If the conditions $nm^{1/3}\|a_n\|_\infty\to 0$ and $\|a_n\|_3 = o(\|a_n\|_2)$ are fulfilled, then $\eta_n$ converges in
distribution to $\mathcal N(0,1)$ and

$$\frac12\sum_{l\in\mathcal L(a_n)}\Big(\frac{na_l\xi_l^2}{na_l+1} - \log(na_l+1)\Big) - \sum_{l\in\mathcal L(a_n)}\frac{na_l(\xi_l^2-1)}{2(na_l+1)} = -\frac{u_n^2}{2} + o(1).$$

Proof. Since $n\|a_n\|_\infty\to 0$, we have $\frac{na_l}{na_l+1} = na_l - (na_l)^2 + O((na_l)^3)$ and $\log(na_l+1) =
na_l - \frac{(na_l)^2}{2} + O((na_l)^3)$. This implies that $\frac12\sum_{l\in\mathcal L(a_n)}\big(\frac{na_l}{na_l+1} - \log(na_l+1)\big) = -\frac{u_n^2}{2} + O(mn^3\|a_n\|_\infty^3)$.
On the other hand, using the central limit theorem for triangular arrays, we get the weak
convergence of $\eta_n$ to $\mathcal N(0,1)$ provided that $u_n^{-3}\sum_l (na_l)^3/(na_l+1)^3$ tends to zero. Since
under the conditions of the lemma this convergence trivially holds, we get the claim of the
lemma. □
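The Taylor expansion behind the deterministic term can be checked numerically. The following sketch compares $\frac12\sum_l\big(\frac{na_l}{na_l+1} - \log(na_l+1)\big)$ with $-u_n^2/2 = -n^2\|a_n\|_2^2/4$ for a small value of $n\|a_n\|_\infty$; the sample size, the array `a` and the function names are assumptions made for the example.

```python
import math


def deterministic_term(n, a):
    """(1/2) * sum_l ( n a_l / (n a_l + 1) - log(n a_l + 1) )."""
    return 0.5 * sum(n * al / (n * al + 1.0) - math.log1p(n * al) for al in a)


def minus_half_u_squared(n, a):
    """-u_n^2 / 2 with u_n^2 = n^2 ||a||_2^2 / 2."""
    return -0.25 * n ** 2 * sum(al ** 2 for al in a)


n = 1000
a = [1e-6] * 50                      # m = 50 nonzero entries, n * a_l = 1e-3
gap = deterministic_term(n, a) - minus_half_u_squared(n, a)
remainder_scale = len(a) * (n * max(a)) ** 3   # m n^3 ||a||_inf^3
```

In this example the gap is of the order of $m n^3\|a_n\|_\infty^3$, in line with the remainder term in the proof.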

Combining Lemma 5 with (38) and (39), we get (37) and the proposition follows. □ 

To complete the proof of Theorem 2, we shall show now that if we choose $T_{n,\gamma}$ as in Theorem 1
and define $v_n$ by

$$v_l = v_{l,n} = \frac{(T_{n,\gamma}q_l - c_l)_+}{\sum_{l'\in\mathcal L}c_{l'}(T_{n,\gamma}q_{l'} - c_{l'})_+},$$

then all the conditions of Proposition 6 are fulfilled. We start by noting that [L1] is
straightforward. To check the first relation in [L2], we use [C1] and $|\mathcal N(T_{n,\gamma})|\to\infty$, along with the
following evaluations:

v/ e M{T na ), ™ = «(rn, 7 fl-cO < <L_ < a 



<q, v) EiQiFn^i - ci)+ ~ EM - ~ W(T n , 

For the second relation in [L2], in view of (15), for all $l\in\mathcal N(T_{n,\gamma})$ we have
cm _ Ci(T nn qi-ci) T n ^ciqi 



T na c iqi 0(1) max feAr(Tn 7) q 

The last term tends to zero due to [C9]. From the definition of $v_n$, equation (15) and condition
[C1] one can deduce that

_ max;(r re;7 g; - q) + T n[7 max;gf 

Ejq(T„, 7 <R-q) + n(E/( T n )7 % -q)^-) 7 

max;g; _ O(l) 

- n|JV*(T nj7 )|i/2max, 9 , K) n\N{T nn )\V^ 

This inequality yields $\|v_n\|_\infty^2\|v_n\|_0^2 = O(|\mathcal N(T_{n,\gamma})|/n^2)$. Therefore, [L3] follows from [C8].

Finally, to check that [L4] is true, we notice that $n\|v_n\|_\infty\|v_n\|_0^{1/3} = O(|\mathcal N(T_{n,\gamma})|^{1/3-1/2}) = o(1)$
and

ll v n.|li _ Ei( T n,7* - c + < m ax; qi C{ /2 



'n\\2 
Thus, all the conditions of Proposition 6 are fulfilled and, therefore,

$$\gamma_n(\mathcal F_0, \mathcal F_1(Cr^*_{n,\gamma})) \ge 2\Phi\Big(-\frac{(1-\delta)n}{2\sqrt2}\|v_n\|_2\Big) + o(1).$$

Since this equation is true for every $\delta\in(0, 1-C)$, it is also true for $\delta = 0$, and the claim of
Theorem 2 follows from (15).



Appendix B: Proofs of lemmas and propositions of Section 3 



B.1. Proof of Lemma 1



Let us write $\Pi f = \Pi_1f + \Pi_2f$, where $\Pi_1$ and $\Pi_2$ are the orthogonal projectors in $L_2(\Delta)$ onto
the subspaces $\mathrm{span}\{\varphi_l : l\in\mathcal N_1(T)\}$ and $\mathrm{span}\{\varphi_l : l\in\mathcal N_2(T)\}$, respectively. We first assume
that the inequality $\sum_l c_l^{-1} < \infty$ is fulfilled.
On the one hand, using the Cauchy-Schwarz inequality, 

n*/ni = / A (E^ 2(T) oAf^dt < 2-(e^ 2(t) m\ 

<2 2d (y cMffYlY c7 i ) 2 <2 2d (y c,- 1 ' 2 

On the other hand, 



|n 1 /-n/j|t= ! ( E (^-^[/])^(t)) dt 

= / (-E E (^(ti)-^m)^(t)) ^. 

• /A \ n i=i /e^m ' 



leMi(T) 

Using Fubini's theorem and Rosenthal's inequality, for some constant C > 0, we get 

I {f>/f E {xm^-em)^)) ) dt 

1 i=i V /eM(T) 7 J 



C 

+ 



By Holder's inequality, we get 



A/ ( E {*m(U)-ei\f\)<pi(t)) <IM(r)| 3 E %(w(ti)-^[/])Vi(t) 

* lr- \T_ fir's ' 1r- A/1 /T^ 



■l£jVi{T) ' leAfi(T) 

<2 2d \M 1 (T)\ 3 E £/(/(tiVi(ti)+&pKti)-0l[/] 



0(|M(T)| 4 ), 
where we used the fact that $E[\xi_1^4] < \infty$ and that $E[f(t_1)^4] \le 2^{2d}\big(\sum_l c_l^{-1}\big)^2 < \infty$ under the
conditions of the lemma. Similar arguments lead to

2 \ 2 



/ (E £ /( E (^(t,)-^[/])^(t)) }dt = 0(n 2 |M( 
Ja < i=i ^ieM(T) ' > 



which implies that Ef\\Hif — Iif n \\f = 0(|A/i(T)| 4 /n 2 ). Combining the obtained evaluations, 
we get 



E f \\Uf- Uf Ji< 



Z: Ci >T 



The required consistency follows from the assumption $|\mathcal N(T_n)| = o(n^{1/2})$.
Let us consider the case £ C W^i?). Without loss of generality, we will assume that S = 
(i?) and q = X^LiG^i) 20 '/-^ 2 - The computations remain the same as in the previous case 
but the term HII2/H4 is bounded using Sobolev inequality (Kolyada, 1993). Indeed, choosing 
<x' so that a\ = (1 — t)g{ and t < 1 — d/ (4<r) (this implies that a' > d/4), we get 

d 



|n 2 /|| 2 < c\\n 2 ff w<T , = c 



< C{d/T) T R 2{l - T) 



leATa(T) i=l 



E c «i' 

JeAT 2 (T) 



< C 

< C(d/T) T R 2 ^- T \ 



ieJV 2 (T) 



This completes the proof, since the last term tends to zero as $T\to\infty$.



B.2. Proof of Lemma 2



Let us introduce IIj/ = J2ke{i 2 J ] d a J,k l P J,k- We first decompose the empirical coefficients as 
follows: 

-y n 1 n 1 n 

Sj,fe = -y^¥?J,fc(ti)a:j = - y~"yj,fc(ti)/(tj) + - yVj,fc(ti)& : = «J,fc + 



Then, using standard arguments, we have 
4 
1 



i=i 



n/-n/ n 



- 33 (|| E ( a J,k - ®J,k)pj,k +1 ^ ej,fc^J,fc 
fce[i,2 J l d fce[i,2 J l d 



+ 



n/ - iw 



with $\|\Pi f - \Pi_Jf\|_4^4 = O(2^{-4J\sigma})$. Furthermore, by well-known properties of wavelet bases
(Cohen, 2003) and the Rosenthal inequality,



I 4 /n2Jd\ 

E f\\ E (<*J,h-&J,h)<PJ,h 4 = 0(2 Jd )YE f (aj >k -aj tk ) A = 0(^) 
fce[i,2 J l d ft ^ ' 



and 



II 4 / 1 n 

E\\ £ ej^j,* ^ = 0(2^) ^ S[6 4 Jifc ] = 0(2^)^S( ^^^(t 



fce[i,2 J 



i=i 



2 2Jd ( - + 



1 2 



Jd 




Finally we obtain, uniformly over / G S, Ef\\Uf - Uf n \\ 4 = 0(^- + ^ + 2' 4JcT ) , and the 
announced result follows. 

Appendix C: Proof of Proposition 4 

We are going to check that all the assumptions of Theorem 1 and Theorem 2 are satisfied. 
We can use the Sobolev embedding theorem (Kolyada, 1993) for [C7]: if a > d/4, then [C7] 
is satisfied. For the pilot estimator proposed in subsection 3.2, [C6] holds as well. Since the 
Fourier basis is uniformly bounded, checking [C3] is straightforward. 

Let now T n ^ — (C*r^) ^(1 -\- 2/^ ^), where and O* are defined in Proposition 4. We will 
show that 

• $T_{n,\gamma}$ satisfies (15),

• $r^*_{n,\gamma}$ defined by (16) satisfies $r^*_{n,\gamma}\sim C^*r^*_n$,

• conditions [C1], [C2], [C5], [C8] and [C9] are fulfilled.

To this end, we need an asymptotic analysis of the terms 

I (T) = % % = E «(« - % 

l&L d ' ' l& d 

and l2(T) = Ii(T) — Io(T). For the first one, it holds that 

j=i i=i 



For every i G {1, . . . , d}, we set 

TT< 1 , 2-rrli k 

m* = , 7j = -. and xi = — — = — . 

2vr' ' 2eri(l-5) ' T^i m, 

Note that, as 5 < 1, we have 7j > 0. With this notation, 

d d „ 

I (T) = T^mi -...•m^dl - ^ |x M | 2CT *)^/(mi • . . . • m d ). 

As mj -> 00 for every i, we can replace the sums by integrals 

Next, we make the change of variables yj = x^ 1 , j = l,...,d and set V = {y € 
Eti^<ntiyf M }.Weget 

4t)q- + rf j j 

TW=sJs r / " Hi " X 2 J — 1 J — 1 



i=l j=X 
Now, we make another change of variables: Z{ = y«(rij=i V? ) Note that Y\i=i \ =
(IliLi yf 1 ^ 1 ) 1 S ■ Therefore, using the notation = {z G : ||z||i < l}, 

45a+d j . , , „ , 

rp 2(1 _ ft) - /» a a i 4cr + a — 2dcT 1 -. 1 -, 

/o(T)^^--/ (n^)"^(l-H) 2 ^ ...^A(z)dz, 
where A(z) is the Jacobian. Standard algebra yields A(z) = ( Yli=i z" l//<Jl ) rf ^ 1 /(l — <5). 

4<5o-+d 

Next we give an explicit form for this integral Io(T) ~ ir~ d T 2 ( 1 ~ s )' i I , where 

-I j- , d o, i 4a + d 1 -. _l_ -, 



;nf=i^)(i-^) ^ 



r , s. N 4 °-+ a i i _j i 

/ (n^o^a-Ni) 2 ^— a. 

i= i 



Now, the Liouville formula (see for instance Ingster and Stepanova (2011)) combined with the
well-known identity $\int_0^1u^{a-1}(1-u)^{b-1}\,du = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ yields

Y[ d v( u ai \ 

j_ lli=l L \2a i "i o-j 2ct(1-5); 



) Z' 1 n d I (4a+d)^ 1 

— / (1 - u) 2 u 2 ° + 2 °^- s ) du 



;nti^)( l -^) r (^ + ( 2 + 

2lltir(«i) 2]lti r (^) ._ 2vr rf C((i, < T,«) 



( nti *o a - w« + 3) ( nil *i) a - + 2)r(« +2) « + 2 

Therefore, 

v ' ' j t*w. 

Very similar computations imply that, as $T\to\infty$, we have

45a + d , 

Ji(T) ~ - — j— , . ^l±^ll = C(d,(T,a)T^W. 

* d (nt 1 ^)( l -w«+2) 
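The Beta-function identity $\int_0^1u^{a-1}(1-u)^{b-1}\,du = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ used in these computations can be verified numerically. The sketch below compares the closed form with a midpoint-rule approximation; the exponents `a = 2.5`, `b = 3.0` and the function names are arbitrary choices made for illustration and are not the exponents appearing in $I_0(T)$ or $I_1(T)$.

```python
import math


def beta_closed_form(a, b):
    """Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def beta_midpoint(a, b, steps=20000):
    """Midpoint-rule approximation of the integral of u^(a-1) (1-u)^(b-1) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += u ** (a - 1.0) * (1.0 - u) ** (b - 1.0)
    return h * total


a_exp, b_exp = 2.5, 3.0
closed = beta_closed_form(a_exp, b_exp)
numeric = beta_midpoint(a_exp, b_exp)
```

The midpoint rule is adequate here because both exponents are at least 1, so the integrand is bounded on $[0,1]$.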

Note now that (15) is equivalent to n 2 T 2 I (T) ~ 8T 4 (/i(T) - I (T)) 2 z 2 _ j/2 . Using the 
asymptotic equivalents for Jo and h we have derived above, one directly checks that the 
value of T„ j7 proposed in Proposition 4 satisfies (15). Furthermore, since (16) is equivalent to 
( r n, 7 ) 2 = h(T n ,"f) /T na l2{T nn ) , we get r* 7 = C*r* (1 + o(l)), as announced in proposition. 

4j(l-g) 

It remains to check that for the sequence T nn x. n ±°+ d conditions [CI], [C2], [C5], [C8] 
and [C9] are fulfilled. Using the same method as the one used above to evaluate Iq, we get 

2d x „ 2(4S<y+d) 

\N{T nn )\xn—* and M(T n , 7 ) = X ™ 45+d • (40) 

«eAf(T„, 7 ) 

The assumption $\sigma > d/4$ implies $|\mathcal N(T_{n,\gamma})|\log|\mathcal N(T_{n,\gamma})| = o(n)$ and, as a consequence,
conditions [C4] and [C8] are true. Furthermore, the second relation in (40) combined with
$\delta < 1$ implies [C2]. Condition [C5] follows from the fact that all the nonzero entries of $q$ are
lower-bounded by 1.

In order to check [C1] and [C9], we need to find an upper bound for $\max_{l\in\mathcal N(T_{n,\gamma})}q_l$. In the
following calculations, the term $C$ is a constant which depends only on $d$, $\sigma$ and $\alpha$ and can
vary from line to line. Let $l\in\mathcal N(T)$, then $c_l < Tq_l$, which implies, for every $i = 1,\ldots,d$,
^(^-at) < CTW^ lf 3 . In particular 

z 2( CTl - ai ) < CT "Q jtoi ^ ^2( CT2 -a 2 ) < CT "Q ^ ^ ^-oa) < cr J-J jtoi _ (41) 
3^1 

Injecting the first inequality of (41) in the second one, we obtain 
Hence 

2 < C (r JJ ) 1 -° l/ffl - a2/ff2 (42) 

and by symmetry, 



L i>3 

Next, using (42), (43) and the third inequality in (41), we get 



3 — \ J-J-i>4 ■? 



Iterations of the previous process lead to the inequality max.,- lp < CT 1 /( 1 - S l Therefore, 

max ZeA/ - (r)(?i = CUU l T 3 ^ CTT ^- C 
yields the inequalities of [CI] and [C9]. 



A 2a -J— 4g(l-3) 

max ieA^(T) Qi = C-Y[j = \lj 3 < CT 1 -*. Combining this bound with T ni7 x n 4 °+ d and (40) 



Appendix D: Proof of Proposition 5 

As in the previous subsection, we begin with the calculation of Jo- Setting xi j = 27r ft and 
using the same method to get an integral, we have 

!o = T ^ E[Nll2-(^W 

d+4 

~S/ f«-(/3 T x) 2 - 
(27r) a jRd L 

This implies the asymptotic relation Jo ~ CqT^- 1 with the constant Co = J K d [ J2i=i( x i~ 

aE^i^i) 2 - Eti^l+dx- Similar computations yield h ~ C^T^A*- 1 ) and I 2 ~ 
C 2 r( d+4 )/( ,J - 1 ), where d and C 2 have the values given in the paragraph preceding the propo- 
sition. 

The rest of the proof can be carried out exactly in the same way as the proof of the previous 

d d+4 

proposition, based on the relation N(T) x T^- 1 and M(T) x T»- 1 . 



lx,ll 2<T 

l X «ll2cr 



\2a 
\2a 



dx. 
Appendix E: Proofs of results stated in Section 4



E.1. Proof of Theorem 3

The arguments are almost the same as in the proof of Theorem 1. We use the array $w_n$ with
entries $w_l = q_l1_{\{l\in\mathcal N(T)\}}/M(T)^{1/2}$ and the kernel $G_n(t_1,t_2) = \sum_{l\in\mathcal L}w_l\varphi_l(t_1)\varphi_l(t_2)$ in order
to define the linear U-test statistic:

$$U_n = \binom{n}{2}^{-1/2}\sum_{1\le i<j\le n}x_ix_j\,G_n(t_i,t_j).$$

We write it as $U_n = U_{n,0} + U_{n,1} + U_{n,2}$, where

$$U_{n,0} = \binom{n}{2}^{-1/2}\sum_{i<j}\xi_i\xi_j\,G_n(t_i,t_j), \qquad U_{n,1} = \binom{n}{2}^{-1/2}\sum_{i<j}\big(\xi_if(t_j) + \xi_jf(t_i)\big)G_n(t_i,t_j)$$

and $U_{n,2} = \binom{n}{2}^{-1/2}\sum_{i<j}f(t_i)f(t_j)G_n(t_i,t_j)$. The first and the second moments of this
U-statistic are described in the next result, in which we use the notation $T_w[f] = \sum_lw_l\theta_l[f]\varphi_l$.
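To make the definition of the linear U-test statistic concrete, here is a minimal sketch of its computation. Several ingredients are assumptions made for illustration only: the design is one-dimensional on $[0,1]$, the basis is the cosine system, and the names `phi`, `weights`, `kernel` and `u_statistic` are hypothetical. The weights follow the normalization $w_l = q_l/M(T)^{1/2}$ on a given index set, with $M(T)$ taken as the sum of $q_l^2$ over that set.

```python
import math
from itertools import combinations


def phi(l, t):
    """Hypothetical orthonormal cosine basis on [0, 1] (assumption for illustration)."""
    return 1.0 if l == 0 else math.sqrt(2.0) * math.cos(math.pi * l * t)


def weights(q, support):
    """w_l = q_l / M^{1/2} on the support, with M the sum of q_l^2 over the support."""
    m_value = sum(q[l] ** 2 for l in support)
    return {l: q[l] / math.sqrt(m_value) for l in support}


def kernel(w, t1, t2):
    """G_n(t1, t2) = sum_l w_l phi_l(t1) phi_l(t2)."""
    return sum(wl * phi(l, t1) * phi(l, t2) for l, wl in w.items())


def u_statistic(x, t, w):
    """U_n = binom(n, 2)^{-1/2} * sum_{i<j} x_i x_j G_n(t_i, t_j)."""
    n = len(x)
    total = sum(x[i] * x[j] * kernel(w, t[i], t[j])
                for i, j in combinations(range(n), 2))
    return total / math.sqrt(math.comb(n, 2))


w = weights({1: 1.0, 2: 2.0, 3: 0.5}, support=[1, 2])
u_two_points = u_statistic(x=[1.0, 2.0], t=[0.0, 0.0], w=w)
```

With two observations at $t_1 = t_2 = 0$ the statistic reduces to $x_1x_2\,G_n(0,0)$, which makes the normalization easy to verify by hand. The double sum costs $O(n^2|\mathcal N(T)|)$ operations in this naive form.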

Lemma 9. Let $w_n = (w_{l,n})_{l\in\mathcal L}$ be an array containing only a finite number of nonzero
entries and such that $\sum_{l\in\mathcal L}w_{l,n}^2 = 1$. Let $\mathcal L(w_n)$ be the support of $w_n$. The expectation of the
U-statistic $U_n$ is given by:

$$E_f[U_n] = E_f[U_{n,2}] = h_n[f,w_n] = \Big(\frac{n(n-1)}{2}\Big)^{1/2}\sum_{l\in\mathcal L}w_{l,n}\theta_l^2[f].$$


Furthermore, if [D2] holds true, thenE[U^ ] = 1, E[U%i\ < 2Z?2||w n ||^ ||w n ||o||/||| and 

In 

Var[C/ n , 2 ] < ^IKOwJoll/ll! + y 11/ ■ T w [/]||1. 

Proof. This result can be proved along the lines of the proof of Lemma 3. The only difference 
is in the evaluation of the term A n ^,^ for which we have 

An ' 2 = n (n- 1) (3) £ wm'6i[f]W]{ I /(t)Vl(t)w(t)dt} 
= 2(n_2)| /• /(^(^^^(t^dt^^.T^m. 

This yields the desired result. □ 
Let us now study the type I and type II error probabilities of the test $\phi_n(T) = 1_{\{|U_n(T)|>u\}}$.

Evaluation of type I error. Using Tchebychev's inequality, for every $u > |E[U_n(T)]|$, we
have

$$\sup_{f\in\mathcal F_0}P_f(|U_n(T)| > u) \le \sup_{f\in\mathcal F_0}P_f\big(|U_n(T) - E[U_n(T)]| > u - |E[U_n(T)]|\big) \le \sup_{f\in\mathcal F_0}\frac{\mathrm{Var}(U_n(T))}{(u - |E[U_n(T)]|)^2}.$$
Let us denote $\nu_{n,T} = nT^{-1}(2M(T))^{-1/2}$. Using Lemma 9, we get

$$|E[U_n(T)]| \le \frac{n}{\sqrt{2M(T)}}\Big|\sum_{l\in\mathcal N(T)}q_l\theta_l^2[f]\Big| = T\nu_{n,T}\Big|\sum_{l\in\mathcal N(T)}q_l\theta_l^2[f]\Big|.$$

Since, under $H_0$, we have $Q[f] = \sum_lq_l\theta_l[f]^2 = 0$ and $\sum_lc_l\theta_l[f]^2 \le 1$, the last sum can be
bounded as follows: $|\sum_{l\in\mathcal N(T)}q_l\theta_l^2[f]| = |\sum_{l:|q_l|\le c_l/T}q_l\theta_l^2[f]| \le T^{-1}\sum_lc_l\theta_l^2[f] \le T^{-1}$. Thus,
$|E[U_n(T)]| = |h_n[w_n,f]| \le \nu_{n,T}$. Combining this bound with those of Lemma 9, we arrive at

p ,. m 1 ^ 3(1 + 2£>i£>2^3 2 + ^i^2^ 3 4 + 2nZV(3M(T)) 
sup P f {(f> n {T) = 1) < — ? ^ 



B 1 + B 2 nM(T)- 1 
2(n - ^„,r) 2 



Consequently, if we choose $u \ge \nu_{n,T} + (B_1 + B_2nM(T)^{-1})^{1/2}\gamma^{-1/2}$, then $\sup_{f\in\mathcal F_0}P_f(\phi_n(T) = 1) \le \gamma/2$.
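The choice of the threshold $u$ can be illustrated numerically. The sketch below assumes, as our reading of the preceding display, that the type I error is bounded by $(B_1 + B_2nM(T)^{-1})/(2(u-\nu_{n,T})^2)$; the constants `b1`, `b2`, the numerical values and the function names are hypothetical placeholders, since the actual constants come from Theorem 3.

```python
import math


def nu(n, t_value, m_value):
    """nu_{n,T} = n T^{-1} (2 M(T))^{-1/2}."""
    return n / (t_value * math.sqrt(2.0 * m_value))


def threshold(n, t_value, m_value, b1, b2, gamma):
    """Smallest u allowed by the text: nu + ((B1 + B2 n / M) / gamma)^{1/2}."""
    return nu(n, t_value, m_value) + math.sqrt((b1 + b2 * n / m_value) / gamma)


def type_one_bound(u, n, t_value, m_value, b1, b2):
    """Assumed Chebyshev-type bound (B1 + B2 n / M) / (2 (u - nu)^2)."""
    return (b1 + b2 * n / m_value) / (2.0 * (u - nu(n, t_value, m_value)) ** 2)


# Placeholder values (assumptions for illustration only).
n, t_value, m_value, b1, b2, gamma = 1000, 4.0, 50.0, 1.0, 2.0, 0.05
u_min = threshold(n, t_value, m_value, b1, b2, gamma)
bound_at_u_min = type_one_bound(u_min, n, t_value, m_value, b1, b2)
```

At the smallest admissible threshold the bound equals $\gamma/2$ exactly, and it decreases for larger $u$.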

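Evaluation of type II error. Using similar arguments, we get

$$\begin{aligned}
\sup_{f\in\mathcal F_1(\rho)}P_f(\phi_n(T) = 0) &= \sup_{f\in\mathcal F_1(\rho)}P_f(|U_n(T)| \le u) \\
&\le \sup_{f\in\mathcal F_1(\rho)}P_f\big(|E[U_n(T)]| - |U_n(T) - E[U_n(T)]| \le u\big) \\
&\le \sup_{f\in\mathcal F_1(\rho)}P_f\big(2^{-1/2}T\nu_{n,T}|Q[f]| - \nu_{n,T} - |U_n(T) - E[U_n(T)]| \le u\big) \\
&\le \sup_{f\in\mathcal F_1(\rho)}P_f\big(2^{-1/2}T\nu_{n,T}\rho^2 - |U_n(T) - E[U_n(T)]| \le u + \nu_{n,T}\big).
\end{aligned}$$

This can also be written as:

$$\sup_{f\in\mathcal F_1(\rho)}P_f(\phi_n(T) = 0) \le \sup_{f\in\mathcal F_1(\rho)}P_f\big(|U_n(T) - E[U_n(T)]| \ge (2^{-1/2}T\rho^2 - 1)\nu_{n,T} - u\big).$$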

Using the Tchebychev inequality and the evaluations obtained in Lemma 9, we get 

sup Pf{d> n lT) = 0) < k. 

feMP) f 2((2-V2 Tp 2 _ iKt _ u f 

Clearly, the right hand-side of this inequality is lower than 7/2 if 



P 2 > 



1/ B 2 n\ 1/2 



V2 V2 

+ 



This completes the proof of Theorem 3. 
E.2. Proof of Corollary 1 



0\ 



It is enough to remark that (since M(-) is increasing and T n < T„ 



w K ^ 1 < 1 

n ~ n " T° ~ T n 

and -7= < T" 1 . In view of these inequalities, the claim of the corollary immediately follows 
from Theorem 3. 
E.3. Proof of Theorem 4

We start by proving that the minimax rate of separation is lower bounded by $n^{-1/4}$. Let
$l_a = \arg\min_{l\in\mathcal L_a}\{c_l\}$ for $a\in\{+,-\}$. We define two functions $f_0$ and $f_1$ as linear combinations
of the basis functions $\varphi_{l_-}$ and $\varphi_{l_+}$. More precisely, $f_i = \theta_{i,-}\varphi_{l_-} + \theta_{i,+}\varphi_{l_+}$, for $i = 0,1$, with



1*4 



ci_\qi+ \ +Ci,\q L 



m. 



Q_qi+ +ci + \qi_ 



and, for some z > 0, 



n,- 



70,- 



9 2 



9 2 



n. 



One easily checks that $f_0\in\mathcal F_0$ and $f_1\in\mathcal F_1(r_n)$ with $r_n^2 = zq_{l_+}/\sqrt n$. Furthermore, the
Kullback-Leibler divergence $K(P_{f_0},P_{f_1}) = \int\log\frac{dP_{f_0}}{dP_{f_1}}\,dP_{f_0}$ between the probability measures
$P_{f_0}$ and $P_{f_1}$ can be bounded as follows:



K(P f0 ,P fl 



E E 



,J fo 



log 



dP 



, x n , t\ , . . . , t n/ 



X)(a:<-/i(t < )) 2 -(x < -/o(t < )) 2 

i=l 

n£[(/ Q (ti) - /i(t!)) 2 ] =«^ e{v} (^ 
n(% + - |0 2 + - zn-V2|i/2)2 < z 2 (2(?0i+) - : 



ti , • • • , t n 



To conclude, it suffices to use inequality (2.74) from (Tsybakov, 2009), which implies that
$\gamma_n(\mathcal F_0,\mathcal F_1(r_n)) \ge 0.25\,e^{-z^2(2\theta_{0,+})^{-2}} = \gamma$ for $z = 2\theta_{0,+}[\ln(4\gamma)^{-1}]^{1/2}$.
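The choice of $z$ is obtained by solving $0.25\,e^{-K} = \gamma$ for the Kullback-Leibler budget $K = z^2(2\theta_{0,+})^{-2}$, which requires $\gamma < 1/4$. A short numerical sketch of this calibration follows; the function names and the value of `theta` are assumptions made for illustration.

```python
import math


def lower_bound_from_kl(kl):
    """Lower bound 0.25 * exp(-K) on the total error, as quoted in the text."""
    return 0.25 * math.exp(-kl)


def z_for_level(gamma, theta):
    """z = 2 * theta * sqrt(log(1 / (4 * gamma))), defined for 0 < gamma < 1/4."""
    if not 0.0 < gamma < 0.25:
        raise ValueError("gamma must lie in (0, 1/4)")
    return 2.0 * theta * math.sqrt(math.log(1.0 / (4.0 * gamma)))


gamma, theta = 0.05, 0.7             # theta plays the role of theta_{0,+}
z = z_for_level(gamma, theta)
bound = lower_bound_from_kl(z ** 2 / (2.0 * theta) ** 2)
```

With this $z$ the Kullback-Leibler budget equals $\ln(1/(4\gamma))$, so the lower bound coincides with the prescribed level $\gamma$.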

It remains to prove the second assertion of the theorem. To ease notation, we write T n instead 
of T2 and set 



Qalf] = Y,, c r ^llf] and :F a = {f- Qa[f] = 0}, 

'eta 



for 



a G {+,-}. 



Let us assume that $M_+(T_n) \ge M_-(T_n)$. We use the fact that testing $Q[f] = 0$ against
$|Q[f]| \ge r_n^2$ with $f\in\Sigma$ is harder than testing $Q_+[f] = 0$ against $Q_+[f] \ge r_n^2$, with $f\in\mathcal F_-$.
The rest of the proof follows the same steps as those of the proof of Theorem 2. As indicated
in Remark 4, we use as 7r n the simplified prior for which O^s are independent Gaussian random 
variables with zero mean and variance a/ = 2 r m+(t ) ~^-{ieC+uAf(T n )}- It i s an eas Y exercice to 
show that conditions [L1]-[L5] of Proposition 6 are fulfilled with $\delta = 1/2$. This completes the
proof of the theorem.